
Showing papers by "Alan C. Bovik published in 2003"


Proceedings ArticleDOI
09 Nov 2003
TL;DR: This paper proposes a multiscale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions, and develops an image synthesis method to calibrate the parameters that define the relative importance of different scales.
Abstract: The structural similarity image quality paradigm is based on the assumption that the human visual system is highly adapted for extracting structural information from the scene, and therefore a measure of structural similarity can provide a good approximation to perceived image quality. This paper proposes a multiscale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions. We develop an image synthesis method to calibrate the parameters that define the relative importance of different scales. Experimental comparisons demonstrate the effectiveness of the proposed method.

4,333 citations
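
The core computation can be outlined in a few lines. Below is a minimal sketch of the multiscale idea, assuming SSIM-style luminance and contrast-structure terms computed with Gaussian windows and dyadic downsampling between scales; the per-scale weights shown are the values popularized in later work and are not quoted from this listing.

```python
# Minimal sketch of multiscale structural similarity (MS-SSIM); assumptions noted above.
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def ssim_terms(x, y, sigma=1.5, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Mean luminance (l) and contrast-structure (cs) comparison terms of SSIM."""
    mu_x, mu_y = gaussian_filter(x, sigma), gaussian_filter(y, sigma)
    var_x = gaussian_filter(x * x, sigma) - mu_x ** 2
    var_y = gaussian_filter(y * y, sigma) - mu_y ** 2
    cov = gaussian_filter(x * y, sigma) - mu_x * mu_y
    l = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    cs = (2 * cov + c2) / (var_x + var_y + c2)
    return l.mean(), cs.mean()

def ms_ssim(x, y, weights=(0.0448, 0.2856, 0.3001, 0.2363, 0.1333)):
    x, y = x.astype(np.float64), y.astype(np.float64)
    score = 1.0
    for i, w in enumerate(weights):
        l, cs = ssim_terms(x, y)
        cs = max(cs, 0.0)                      # guard against a negative base for the fractional power
        if i < len(weights) - 1:
            score *= cs ** w                   # contrast/structure only at the finer scales
            x, y = zoom(x, 0.5), zoom(y, 0.5)  # low-pass + downsample stand-in between scales
        else:
            score *= (max(l, 0.0) * cs) ** w   # luminance enters only at the coarsest scale
    return score
```

A full implementation would use the paper's calibrated weights and its exact low-pass downsampling filter; the clamping above is only a guard for the fractional exponents.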


Proceedings Article
01 Dec 2003
TL;DR: This paper proposes a multi-scale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions, and develops an image synthesis method to calibrate the parameters that define the relative importance of different scales.
Abstract: The structural similarity image quality paradigm is based on the assumption that the human visual system is highly adapted for extracting structural information from the scene, and therefore a measure of structural similarity can provide a good approximation to perceived image quality. This paper proposes a multi-scale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions. We develop an image synthesis method to calibrate the parameters that define the relative importance of different scales. Experimental comparisons demonstrate the effectiveness of the proposed method.

1,205 citations


01 Jan 2003
TL;DR: It is imperative for a video service system to be able to recognize and quantify the video quality degradations that occur in the system, so that it can maintain, control, and possibly enhance the quality of the video data.
Abstract: Digital video data, stored in video databases and distributed through communication networks, is subject to various kinds of distortions during acquisition, compression, processing, transmission, and reproduction. For example, lossy video compression techniques, which are almost always used to reduce the bandwidth needed to store or transmit video data, may degrade the quality during the quantization process. As another example, digital video bitstreams delivered over error-prone channels, such as wireless channels, may be received imperfectly due to impairments that occur during transmission. Packet-switched communication networks, such as the Internet, can cause loss or severe delay of received data packets, depending on the network conditions and the quality of service. All of these transmission errors may result in distortions in the received video data. It is therefore imperative for a video service system to be able to recognize and quantify the video quality degradations that occur in the system, so that it can maintain, control, and possibly enhance the quality of the video data. An effective image and video quality metric is crucial for this purpose.

350 citations


Journal ArticleDOI
TL;DR: A foveation scalable video coding (FSVC) algorithm which supplies good quality-compression performance as well as effective rate scalability, and is adaptable to different applications, such as knowledge-based video coding and video communications over time-varying, multiuser and interactive networks.
Abstract: Image and video coding is an optimization problem. A successful image and video coding algorithm delivers a good tradeoff between visual quality and other coding performance measures, such as compression, complexity, scalability, robustness, and security. In this paper, we follow two recent trends in image and video coding research. One is to incorporate human visual system (HVS) models to improve the current state-of-the-art of image and video coding algorithms by better exploiting the properties of the intended receiver. The other is to design rate scalable image and video codecs, which allow the extraction of coded visual information at continuously varying bit rates from a single compressed bitstream. Specifically, we propose a foveation scalable video coding (FSVC) algorithm which supplies good quality-compression performance as well as effective rate scalability. The key idea is to organize the encoded bitstream to provide the best decoded video at an arbitrary bit rate in terms of foveated visual quality measurement. A foveation-based HVS model plays an important role in the algorithm. The algorithm is adaptable to different applications, such as knowledge-based video coding and video communications over time-varying, multiuser and interactive networks.

212 citations
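
As a rough illustration of the foveation-based weighting that drives such a coder, the sketch below builds a weight map that decays with retinal eccentricity from an assumed fixation point; the half-resolution eccentricity value and the pixel-to-degree geometry are illustrative assumptions, not parameters taken from the paper.

```python
# Sketch of an eccentricity-based foveation weight map; parameters are assumptions.
import numpy as np

def foveation_weight_map(height, width, fixation, viewing_distance_px, e2_deg=2.3):
    """Weight in (0, 1] that decays with eccentricity from the fixation point (row, col)."""
    ys, xs = np.mgrid[0:height, 0:width]
    dist_px = np.hypot(ys - fixation[0], xs - fixation[1])
    # Convert on-screen distance to eccentricity in degrees of visual angle.
    ecc_deg = np.degrees(np.arctan2(dist_px, viewing_distance_px))
    # Perceptual importance falls off roughly as e2 / (e2 + e) with eccentricity e.
    return e2_deg / (e2_deg + ecc_deg)

# Example: weight map for a CIF-sized frame viewed from three frame-widths away.
weights = foveation_weight_map(288, 352, fixation=(144, 176), viewing_distance_px=3 * 352)
```

Such a map could then be used to rank coded data so that bits spent near the fixation point come first in the scalable bitstream.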


Journal ArticleDOI
TL;DR: This approach leads to enhanced computational efficiency by interpreting nonuniform-density foveated images on the uniform domain and by using a foveation protocol between the encoder and the decoder.
Abstract: This paper explores the problem of communicating high-quality, foveated video streams in real time. Foveated video exploits the nonuniform resolution of the human visual system by preferentially allocating bits according to the proximity to assumed visual fixation points, thus delivering perceptually high quality at greatly reduced bandwidths. Foveated video streams possess specific data density properties that can be exploited to enhance the efficiency of subsequent video processing. Here, we exploit these properties to construct several efficient foveated video processing algorithms: foveation filtering (local bandwidth reduction), motion estimation, motion compensation, video rate control, and video postprocessing. Our approach leads to enhanced computational efficiency by interpreting nonuniform-density foveated images on the uniform domain and by using a foveation protocol between the encoder and the decoder.

73 citations
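
The foveation filtering step (local bandwidth reduction) can be pictured as a spatially varying low-pass filter. The sketch below, a rough stand-in rather than the paper's algorithm, blends a small stack of Gaussian-blurred copies of a grayscale frame according to distance from the fixation point; the blur schedule is an assumption chosen for illustration.

```python
# Sketch of foveation filtering via a blur stack; the sigma schedule is an assumption.
import numpy as np
from scipy.ndimage import gaussian_filter

def foveation_filter(img, fixation, sigmas=(0.0, 1.0, 2.0, 4.0, 8.0)):
    """Blur a grayscale image progressively with distance from fixation (row, col)."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    ecc = np.hypot(ys - fixation[0], xs - fixation[1])
    ecc = ecc / ecc.max()                          # normalized eccentricity in [0, 1]
    idx = ecc * (len(sigmas) - 1)                  # fractional index into the blur stack
    lo = np.floor(idx).astype(int)
    hi = np.minimum(lo + 1, len(sigmas) - 1)
    frac = idx - lo
    stack = np.stack([img.astype(np.float64) if s == 0 else
                      gaussian_filter(img.astype(np.float64), s) for s in sigmas])
    rows, cols = np.indices((h, w))
    # Linearly interpolate between the two nearest blur levels at every pixel.
    return (1 - frac) * stack[lo, rows, cols] + frac * stack[hi, rows, cols]
```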


Journal ArticleDOI
TL;DR: It is demonstrated that foveation in the DCT domain can actually result in computational speed-ups, and can be incorporated into standard motion compensation and discrete cosine transform (DCT)-based video coding techniques for low bit rate video coding, without incurring prohibitive complexity overhead.
Abstract: Lossy video compression methods often rely on modeling the abilities and limitations of the intended receiver, the human visual system (HVS), to achieve the highest possible compression with as little effect on perceived quality as possible. Foveation, which is non-uniform resolution perception of the visual stimulus by the HVS due to the non-uniform density of photoreceptor cells in the eye, has been demonstrated to be useful for reducing bit rates beyond the abilities of uniform resolution video coders. In this work, we present real-time foveation techniques for low bit rate video coding. First, we develop an approximate model for foveation. Then, we demonstrate that foveation, as described by this model, can be incorporated into standard motion compensation and discrete cosine transform (DCT)-based video coding techniques for low bit rate video coding, such as the H.263 or MPEG-4 video coding standards, without incurring prohibitive complexity overhead. We demonstrate that foveation in the DCT domain can actually result in computational speed-ups. The techniques presented can be implemented using the baseline modes in the video coding standards and do not require any modification to, or post-processing at, the decoder.

46 citations
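
A simple way to picture foveation in the DCT domain is to drop, in each 8x8 block, the coefficients above a cutoff that shrinks with the block's distance from the fixation point, which is also where the computational savings come from. The sketch below illustrates that idea under an assumed linear cutoff schedule; it is not the H.263/MPEG-4 integration described in the paper.

```python
# Sketch of DCT-domain foveation on 8x8 blocks; the cutoff schedule is an assumption.
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def idct2(block):
    return idct(idct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def foveate_dct(img, fixation, block=8):
    """Zero high-frequency DCT coefficients of blocks far from fixation (row, col)."""
    h, w = img.shape
    out = img.astype(np.float64).copy()
    max_ecc = np.hypot(h, w)
    u, v = np.mgrid[0:block, 0:block]
    for by in range(0, h - h % block, block):
        for bx in range(0, w - w % block, block):
            ecc = np.hypot(by - fixation[0], bx - fixation[1]) / max_ecc
            # Keep fewer coefficients as eccentricity grows (always keep at least DC).
            cutoff = max(1, int(round(block * (1.0 - ecc))))
            coeffs = dct2(out[by:by + block, bx:bx + block])
            coeffs[u + v >= cutoff] = 0.0
            out[by:by + block, bx:bx + block] = idct2(coeffs)
    return out
```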


Proceedings ArticleDOI
01 Dec 2003
TL;DR: A new technique for the detection of spiculated masses in digitized mammograms, consisting of a new enhancement algorithm and a new set of linear image filters created for the detection stage.
Abstract: In this paper, we present a new technique for the detection of spiculated masses in digitized mammograms. The technique consists of two stages: enhancement of spiculations followed by the detection of the location where they converge. We describe a new algorithm for the enhancement and a new set of linear image filters which we have created for the detection stage. We have tested the algorithm on digitized mammograms obtained from the Digital Database for Screening Mammography (DDSM). Results of the detection algorithm are shown. Finally, we show that the algorithm may be modified for the detection of architectural distortions.

42 citations


Proceedings ArticleDOI
24 Nov 2003
TL;DR: This paper describes a data-driven approach that uses eye tracking in tandem with principal component analysis to extract low-level image features that attract human gaze; these features resemble derivatives of the 2D Gaussian operator.
Abstract: The ability to automatically detect 'visually interesting' regions in an image has many practical applications, especially in the design of active machine vision systems. This paper describes a data-driven approach that uses eye tracking in tandem with principal component analysis to extract low-level image features that attract human gaze. Data analysis on an ensemble of image patches extracted at the observer's point of gaze revealed features that resemble derivatives of the 2D Gaussian operator. Dissimilarities between human and random fixations are investigated by comparing the features extracted at the point of gaze to the general image structure obtained by random sampling in Monte Carlo simulations. Finally, a simple application where these features are used to predict fixations is illustrated.

39 citations
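
The data-driven pipeline sketched below mirrors the description above at a high level: patches are extracted at recorded fixation locations and PCA is run over the ensemble, with the leading components standing in for the gaze-attracting features. The patch size and the input variable names are assumptions for illustration.

```python
# Sketch of PCA over fixation-centered image patches; patch size is an assumption.
import numpy as np

def gaze_pca(images, fixations, patch=16, n_components=8):
    """images: list of 2D arrays; fixations: list of (row, col) lists, one per image."""
    half = patch // 2
    patches = []
    for img, fix_list in zip(images, fixations):
        for r, c in fix_list:
            p = img[r - half:r + half, c - half:c + half]
            if p.shape == (patch, patch):          # skip fixations too close to the border
                patches.append(p.ravel().astype(np.float64))
    X = np.array(patches)
    X -= X.mean(axis=0)                            # center the patch ensemble
    # Principal components = right singular vectors of the centered data matrix.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    return vt[:n_components].reshape(n_components, patch, patch)
```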


Proceedings ArticleDOI
01 Dec 2003
TL;DR: The retinal layers of a monkey were imaged using a polarization sensitive optical coherence tomography (PS-OCT) system in an effort to develop a clinically reliable automatic diagnostic system for glaucoma.
Abstract: The retinal layers of a monkey were imaged using a polarization-sensitive optical coherence tomography (PS-OCT) system in an effort to develop a clinically reliable automatic diagnostic system for glaucoma. Glaucoma is characterized by the progressive loss of ganglion cells and axons in the retinal nerve fiber layer (RNFL). Automatic segmentation of the RNFL from the PS-OCT images is a fundamental step in diagnosing the progress of the disease. Due to the use of coherent light, speckle noise is inherent in the images. Wavelet denoising techniques, combined with other image processing techniques, were applied to remove the speckle noise in the PS-OCT images, and a fuzzy logic classifier was used to segment the RNFL. A significant signal-to-noise ratio improvement was observed both qualitatively and quantitatively after the denoising. The upper boundary of the RNFL was reliably detected, but detection of the lower boundary remains a problem.

12 citations
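
A minimal sketch of wavelet-based speckle suppression of the kind mentioned above is given below: a log transform makes the multiplicative speckle approximately additive, detail coefficients are soft-thresholded, and the image is reconstructed. The wavelet, decomposition level, and threshold rule are illustrative assumptions, not the settings used in the paper.

```python
# Sketch of wavelet despeckling via soft thresholding; parameters are assumptions.
import numpy as np
import pywt

def despeckle(img, wavelet='db4', level=3):
    log_img = np.log1p(img.astype(np.float64))          # multiplicative -> roughly additive noise
    coeffs = pywt.wavedec2(log_img, wavelet, level=level)
    # Robust noise estimate from the finest diagonal detail band (median / 0.6745 rule).
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thresh = sigma * np.sqrt(2 * np.log(log_img.size))  # universal threshold
    denoised = [coeffs[0]] + [
        tuple(pywt.threshold(d, thresh, mode='soft') for d in detail)
        for detail in coeffs[1:]
    ]
    return np.expm1(pywt.waverec2(denoised, wavelet))
```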


Proceedings ArticleDOI
01 Dec 2003
TL;DR: This paper presents a blind quality assessment algorithm for images compressed by JPEG2000 using natural scene statistics (NSS) modelling and shows how reasonably comprehensive NSS models can help in making blind, but accurate, predictions of quality.
Abstract: Measurement of image quality is crucial for many image-processing algorithms, such as acquisition, compression, restoration, enhancement and reproduction. Traditionally, researchers in image quality assessment have focused on equating image quality with similarity to a 'reference' or 'perfect' image. The field of blind, or no-reference, quality assessment, in which image quality is predicted without the reference image, has been largely unexplored. In this paper, we present a blind quality assessment algorithm for images compressed by JPEG2000 using natural scene statistics (NSS) modelling. We show how reasonably comprehensive NSS models can help us in making blind, but accurate, predictions of quality. Our algorithm performs close to the limit imposed on useful prediction by the variability between human subjects.

12 citations
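
To make the NSS idea concrete, the sketch below summarizes wavelet-subband statistics (fraction of near-zero coefficients and kurtosis) that JPEG2000 quantization tends to push away from their natural range; a learned mapping from such features to subjective scores would complete a blind predictor. This illustrates the principle only and is not the specific NSS model used in the paper.

```python
# Sketch of simple wavelet-domain NSS features for blind quality prediction (assumed form).
import numpy as np
import pywt

def nss_features(img, wavelet='db2', level=3):
    """Per-subband fraction of near-zero coefficients and sample kurtosis."""
    coeffs = pywt.wavedec2(img.astype(np.float64), wavelet, level=level)
    feats = []
    for detail in coeffs[1:]:
        for band in detail:
            c = band.ravel()
            near_zero = np.mean(np.abs(c) < 1e-2 * (np.abs(c).max() + 1e-12))
            kurt = np.mean((c - c.mean()) ** 4) / (c.var() ** 2 + 1e-12)
            feats.extend([near_zero, kurt])
    return np.array(feats)
```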


Journal ArticleDOI
01 May 2003
TL;DR: Foveation-based error resilience and unequal error protection techniques over highly error-prone mobile networks are introduced, and a foveation-based bitstream partitioning is developed in order to alleviate the degradation of visual quality.
Abstract: By exploiting new human-machine interface techniques, such as visual eyetrackers, it should be possible to develop more efficient visual multimedia services with low bandwidth, dynamic channel adaptation, and robust visual data transmission. In this paper, we introduce foveation-based error resilience and unequal error protection techniques over highly error-prone mobile networks. Each frame is spatially divided into foveated and background layers according to perceptual importance. Perceptual importance is determined either through an eye tracker or by manually selecting a region of interest. We attempt to improve reconstructed visual quality by maintaining the high visual source throughput of the foveated layer using foveation-based error resilience and error correction based on a combination of turbo codes and ARQ (automatic repeat request). In order to alleviate the degradation of visual quality, a foveation-based bitstream partitioning is developed. In an effort to increase the source throughput of the foveated layer, we develop unequal delay-constrained ARQ and rate-compatible punctured turbo codes, where the puncturing pattern of the RCPC (rate-compatible punctured convolutional) codes in H.223 Annex C is used. In the simulations, the visual quality in the area of interest is significantly increased using foveation-based error resilience and unequal error protection (as much as 3 dB improvement in FPSNR (foveal peak signal-to-noise ratio)) at a 40% packet error rate. Over real fading statistics measured in the downtown area of Austin, Texas, the visual quality is increased by up to 1.5 dB in PSNR and 1.8 dB in FPSNR at a channel SNR of 5 dB.
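
The FPSNR figures quoted above can be understood as a PSNR in which squared errors are weighted by a foveation map centered on the point of gaze. The sketch below shows one plausible form of such a measure; the eccentricity weighting and the pixels-per-degree geometry are assumptions, not the paper's exact definition.

```python
# Sketch of a foveation-weighted PSNR (FPSNR); the weighting model is an assumption.
import numpy as np

def fpsnr(reference, distorted, fixation, e2=2.3, px_per_degree=30.0, peak=255.0):
    ys, xs = np.mgrid[0:reference.shape[0], 0:reference.shape[1]]
    ecc_deg = np.hypot(ys - fixation[0], xs - fixation[1]) / px_per_degree
    w = e2 / (e2 + ecc_deg)                      # weight decays with eccentricity
    wmse = np.sum(w * (reference.astype(np.float64) - distorted) ** 2) / np.sum(w)
    return 10.0 * np.log10(peak ** 2 / wmse)
```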

01 Jan 2003
TL;DR: A new technique for the detection of spiculated masses in digitized mammograms is proposed, using a new filtering algorithm and a new set of linear image filters called Radial Spiculation Filters.
Abstract: In this paper, we present a new technique for the detection of spiculated masses in digitized mammograms. The technique consists of two stages, enhancement of spiculations followed by the detection of the location where they converge. We describe a new algorithm for the enhancement and a new set of linear image filters which we have created for the detection stage. We have tested the algorithm on digitized mammograms obtained from the Digital Database for Screening Mammography (DDSM). Results of the detection algorithm are shown. Finally, we show that the algorithm may be modified for the detection of architectural distortions. The American Cancer Society estimates that in 2003, 211,300 women will be affected by breast cancer and 39,800 will die due to it [1]. Early detection of cancer helps save lives, and screening mammography is the most effective tool for early detection of breast cancer [1]. Screening mammography involves the detection of abnormalities like lesions, which are characterized by their shape and margin. Spiculated lesions are highly suspicious signs of breast cancer [2]. By definition, a spiculated lesion is characterized by lines (or spiculations) radiating from the margins of the mass [3]. In this paper, we propose a new algorithm for the detection of spiculated masses in digitized mammograms. The algorithm consists of two steps: enhancement of certain features using a new filtering algorithm, followed by detection of the enhanced features using a novel set of linear filters called Radial Spiculation Filters. The filtering algorithm aims to enhance the linear features of masses called spiculations. This is done by computing the Radon transform of the image, followed by filtering and thresholding in the Radon domain and computing an inverse Radon transform to obtain the enhanced image. The Radial Spiculation Filters are designed to find the spatial location where the spiculations converge. The organization of the paper is as follows: Section II describes the theory of the filtering algorithm and the new set of filters called Radial Spiculation Filters. The methodology and the data sets are described in Section III. Section IV presents the results, and the conclusion is given in Section V.
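
The Radon-domain enhancement step described above can be sketched directly: the Radon transform concentrates the energy of linear spiculations into peaks, the sinogram is thresholded to keep only strong line evidence, and the inverse transform returns an image with those linear structures emphasized. The threshold choice and the assumption of a square region of interest are illustrative; the paper's Radial Spiculation Filters for locating the convergence point are not reproduced here.

```python
# Sketch of Radon-domain enhancement of linear features; parameters are assumptions.
import numpy as np
from skimage.transform import radon, iradon

def enhance_spiculations(roi, keep_percentile=99.0):
    """roi: square grayscale region of interest around a candidate mass."""
    angles = np.linspace(0.0, 180.0, 180, endpoint=False)
    sinogram = radon(roi.astype(np.float64), theta=angles, circle=False)
    # Threshold in the Radon domain: retain only the strongest line responses.
    thresh = np.percentile(sinogram, keep_percentile)
    sinogram = np.where(sinogram >= thresh, sinogram, 0.0)
    # Back-project to get an image in which the linear (spiculation-like) features dominate.
    return iradon(sinogram, theta=angles, circle=False, output_size=roi.shape[0])
```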