Book Chapter DOI

Behavioral Study of Defocus Cue Preserving Image Compression Algorithm for Depth Perception

TL;DR: This paper presents a behavioral study showing that images compressed using defocus-cue-preserving compression yield better depth perception than images compressed with standard JPEG.
Abstract: Image and video processing has been an active research field over the past few years. Different coding schemes are available in the literature for image and video compression that improve the compression ratio while maintaining picture quality. Many of these algorithms use ROI coding, such as saliency-based approaches built on different image features, but very few works address depth-cue-preserving compression. In this paper, we present a behavioral study showing that images compressed using defocus-cue-preserving compression yield better depth perception than images compressed with standard JPEG. We compare images compressed using different schemes against the original image. We collect data from different participants by showing them the original and compressed images, and the responses are analyzed using analysis of variance. The analysis shows that images compressed using defocus-cue-based compression provide a better perception of the raw image than the standard JPEG-compressed images.
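As an illustration of the kind of analysis the abstract describes, the sketch below runs a one-way ANOVA over depth-perception ratings for two compression schemes. The scheme names and rating values are hypothetical placeholders, not data from the paper.

```python
# Minimal sketch of the analysis described above: a one-way ANOVA comparing
# depth-perception ratings across compression schemes.  The rating values
# below are illustrative placeholders, not data from the paper.
from scipy import stats

# Hypothetical ratings (e.g., on a 1-5 scale) from participants viewing
# images compressed with each scheme.
ratings_defocus = [4, 5, 4, 4, 5, 3, 4]   # defocus-cue-preserving compression
ratings_jpeg    = [3, 3, 4, 2, 3, 3, 2]   # standard JPEG compression

f_stat, p_value = stats.f_oneway(ratings_defocus, ratings_jpeg)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value would indicate a significant difference in perceived depth
# between the two compression schemes.
```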
References
Journal Article DOI
TL;DR: The Baseline method has been by far the most widely implemented JPEG method to date, and is sufficient in its own right for a large number of applications.
Abstract: For the past few years, a joint ISO/CCITT committee known as JPEG (Joint Photographic Experts Group) has been working to establish the first international compression standard for continuous-tone still images, both grayscale and color. JPEG’s proposed standard aims to be generic, to support a wide variety of applications for continuous-tone images. To meet the differing needs of many applications, the JPEG standard includes two basic compression methods, each with various modes of operation. A DCT-based method is specified for “lossy” compression, and a predictive method for “lossless” compression. JPEG features a simple lossy technique known as the Baseline method, a subset of the other DCT-based modes of operation. The Baseline method has been by far the most widely implemented JPEG method to date, and is sufficient in its own right for a large number of applications. This article provides an overview of the JPEG standard, and focuses in detail on the Baseline method.
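The Baseline method is built around an 8x8 block DCT followed by coefficient quantization and entropy coding. The sketch below shows only the transform-and-quantize core on a single block, assuming NumPy and SciPy are available; a single uniform step size stands in for the standard quantization tables, and entropy coding is omitted.

```python
# Minimal sketch of the lossy core of Baseline JPEG: an 8x8 block is
# level-shifted, transformed with a 2-D DCT, and quantized.  Huffman coding,
# the chroma path, and the standard quantization tables are omitted; a single
# illustrative step size is used instead.
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    return dct(dct(block.T, norm='ortho').T, norm='ortho')

def idct2(coeffs):
    return idct(idct(coeffs.T, norm='ortho').T, norm='ortho')

block = np.random.randint(0, 256, (8, 8)).astype(float)  # one 8x8 pixel block
step = 16.0                                               # illustrative quantizer step

coeffs = dct2(block - 128)                 # level shift, then forward DCT
quantized = np.round(coeffs / step)        # lossy step: coefficient quantization
reconstructed = idct2(quantized * step) + 128

print("max abs reconstruction error:", np.abs(block - reconstructed).max())
```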

3,944 citations

Proceedings Article DOI
17 Jun 2007
TL;DR: A simple method for visual saliency detection is presented that is independent of features, categories, or other forms of prior knowledge of the objects, together with a fast method to construct the corresponding saliency map in the spatial domain.
Abstract: The ability of the human visual system to detect visual saliency is extraordinarily fast and reliable. However, computational modeling of this basic intelligent behavior still remains a challenge. This paper presents a simple method for visual saliency detection. Our model is independent of features, categories, or other forms of prior knowledge of the objects. By analyzing the log-spectrum of an input image, we extract the spectral residual of the image in the spectral domain, and propose a fast method to construct the corresponding saliency map in the spatial domain. We test this model on both natural pictures and artificial images such as psychological patterns. The results indicate fast and robust saliency detection by our method.
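A minimal sketch of the spectral-residual idea described above, assuming the input is already a small 2-D grayscale float array (the original work resizes images to a small fixed width first); the smoothing parameters here are illustrative.

```python
# Spectral-residual saliency sketch: the log-amplitude spectrum minus its
# local average is the "spectral residual"; combining it with the original
# phase and transforming back gives a saliency map after smoothing.
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def spectral_residual_saliency(img):
    spectrum = np.fft.fft2(img)
    log_amplitude = np.log(np.abs(spectrum) + 1e-8)
    phase = np.angle(spectrum)

    # Spectral residual: log amplitude minus its 3x3 local average.
    residual = log_amplitude - uniform_filter(log_amplitude, size=3)

    # Back to the spatial domain; squaring and smoothing yields the map.
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    return gaussian_filter(saliency, sigma=2.5)
```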

3,464 citations

Journal Article DOI
TL;DR: Four combination strategies are compared using three databases of natural color images and it is found that strategy (4) and its simplified, computationally efficient approximation yielded significantly better performance than (1), with up to fourfold improvement, while preserving generality.
Abstract: Bottom-up or saliency-based visual attention allows primates to detect nonspecific conspicuous targets in cluttered scenes. A classical metaphor, derived from electrophysiological and psychophysical studies, describes attention as a rapidly shiftable “spotlight.” We use a model that reproduces the attentional scan paths of this spotlight. Simple multi-scale “feature maps” detect local spatial discontinuities in intensity, color, and orientation, and are combined into a unique “master” or “saliency” map. The saliency map is sequentially scanned, in order of decreasing saliency, by the focus of attention. We here study the problem of combining feature maps, from different visual modalities (such as color and orientation), into a unique saliency map. Four combination strategies are compared using three databases of natural color images: (1) Simple normalized summation, (2) linear combination with learned weights, (3) global nonlinear normalization followed by summation, and (4) local nonlinear competition between salient locations followed by summation. Performance was measured as the number of false detections before the most salient target was found. Strategy (1) always yielded poorest performance and (2) best performance, with a threefold to eightfold improvement in time to find a salient target. However, (2) yielded specialized systems with poor generalization. Interestingly, strategy (4) and its simplified, computationally efficient approximation (3) yielded significantly better performance than (1), with up to fourfold improvement, while preserving generality.
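For concreteness, the sketch below contrasts strategy (1), simple normalized summation, with a simplified stand-in for the global nonlinear normalization of strategy (3); the (max - mean)^2 weighting is only an approximation of the normalization operator described in the paper, not its exact definition.

```python
# Two ways of combining feature maps into a single saliency map.
import numpy as np

def _rescale(m):
    # Rescale a map to [0, 1]; flat maps are left unchanged.
    rng = m.max() - m.min()
    return (m - m.min()) / rng if rng > 0 else m

def combine_simple(feature_maps):
    # Strategy (1): rescale each map to a fixed range and sum.
    return sum(_rescale(m) for m in feature_maps)

def combine_nonlinear(feature_maps):
    # Strategy (3), simplified: maps whose global maximum stands well above
    # the mean activity receive a larger weight; near-uniform maps are
    # suppressed.  The (max - mean)^2 weight is a stand-in for the paper's
    # normalization operator.
    total = np.zeros_like(feature_maps[0], dtype=float)
    for m in feature_maps:
        norm = _rescale(m)
        weight = (norm.max() - norm.mean()) ** 2
        total += weight * norm
    return total
```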

479 citations

Proceedings Article DOI
18 Mar 2015
TL;DR: This paper describes how RAISE has been collected and organized, discusses how digital image forensics and many other multimedia research areas may benefit from this new publicly available benchmark dataset, and tests a very recent forensic technique for JPEG compression detection.
Abstract: Digital forensics is a relatively new research area which aims at authenticating digital media by detecting possible digital forgeries. Indeed, the ever-increasing availability of multimedia data on the web, coupled with the great advances reached by computer graphical tools, makes the modification of an image and the creation of visually compelling forgeries an easy task for any user. This in turn creates the need for reliable tools to validate the trustworthiness of the represented information. In such a context, we present here RAISE, a large dataset of 8156 high-resolution raw images, depicting various subjects and scenarios, properly annotated and available together with accompanying metadata. Such a wide collection of untouched and diverse data is intended to become a powerful resource for, but not limited to, forensic researchers by providing a common benchmark for a fair comparison, testing, and evaluation of existing and next-generation forensic algorithms. In this paper we describe how RAISE has been collected and organized, discuss how digital image forensics and many other multimedia research areas may benefit from this new publicly available benchmark dataset, and test a very recent forensic technique for JPEG compression detection.

440 citations

Proceedings Article DOI
17 Jun 2006
TL;DR: Testing on 750 artificial and natural scenes shows that the model’s predictions are consistent with a large body of available literature on human psychophysics of visual search, suggesting that it may provide a good approximation of how humans combine bottom-up and top-down cues.
Abstract: Integration of goal-driven, top-down attention and image-driven, bottom-up attention is crucial for visual search. Yet, previous research has mostly focused on models that are purely top-down or bottom-up. Here, we propose a new model that combines both. The bottom-up component computes the visual salience of scene locations in different feature maps extracted at multiple spatial scales. The top-down component uses accumulated statistical knowledge of the visual features of the desired search target and background clutter to optimally tune the bottom-up maps such that target detection speed is maximized. Testing on 750 artificial and natural scenes shows that the model’s predictions are consistent with a large body of available literature on human psychophysics of visual search. These results suggest that our model may provide a good approximation of how humans combine bottom-up and top-down cues so as to optimize target detection speed.
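The sketch below illustrates the gist of the top-down tuning: each bottom-up feature map is weighted by how diagnostic that feature is of the target relative to background clutter before summation. The target-to-background ratio used as the gain is a simplified stand-in for the optimal gains derived in the paper, and the function and parameter names are hypothetical.

```python
# Gain-modulated combination of bottom-up feature maps into a search map.
import numpy as np

def top_down_salience(feature_maps, target_means, background_means):
    """feature_maps: list of 2-D arrays, one per feature/scale.
    target_means / background_means: expected response of each feature to the
    learned target and to background clutter (learned offline).
    """
    salience = np.zeros_like(feature_maps[0], dtype=float)
    for fmap, t, b in zip(feature_maps, target_means, background_means):
        gain = t / (b + 1e-8)        # boost features diagnostic of the target
        salience += gain * fmap
    return salience
```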

435 citations