Showing papers by "Alan C. Bovik" published in 2001


Journal ArticleDOI
TL;DR: An embedded foveation image coding (EFIC) algorithm, which orders the encoded bitstream to optimize foveated visual quality at arbitrary bit-rates, is proposed and a modified SPIHT algorithm is developed to improve the coding efficiency.
Abstract: The human visual system (HVS) is highly space-variant in sampling, coding, processing, and understanding. The spatial resolution of the HVS is highest around the point of fixation (foveation point) and decreases rapidly with increasing eccentricity. By taking advantage of this fact, it is possible to remove considerable high-frequency information redundancy from the peripheral regions and still reconstruct a perceptually good quality image. Great success has been obtained previously by a class of embedded wavelet image coding algorithms, such as the embedded zerotree wavelet (EZW) and the set partitioning in hierarchical trees (SPIHT) algorithms. Embedded wavelet coding not only provides very good compression performance, but also has the property that the bitstream can be truncated at any point and still be decoded to recreate a reasonably good quality image. In this paper, we propose an embedded foveation image coding (EFIC) algorithm, which orders the encoded bitstream to optimize foveated visual quality at arbitrary bit-rates. A foveation-based image quality metric, namely, foveated wavelet image quality index (FWQI), plays an important role in the EFIC system. We also developed a modified SPIHT algorithm to improve the coding efficiency. Experiments show that EFIC integrates foveation filtering with foveated image coding and demonstrates very good coding performance and scalability in terms of foveated image quality measurement.

221 citations
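The resolution falloff with eccentricity that this abstract relies on can be illustrated with a short sketch. The exponential contrast-threshold model and the constants `ALPHA`, `E2`, and `CT0` are assumptions taken from the general foveation literature, not values stated in the abstract, and the function names are hypothetical.

```python
import math

# Assumed model constants (illustrative only; not given in the abstract).
ALPHA = 0.106    # spatial-frequency decay constant
E2 = 2.3         # half-resolution eccentricity, in degrees
CT0 = 1.0 / 64   # minimum contrast threshold


def eccentricity_deg(x, y, fix_x, fix_y, viewing_distance_px):
    """Visual angle (degrees) between pixel (x, y) and the fixation point."""
    dist = math.hypot(x - fix_x, y - fix_y)
    return math.degrees(math.atan2(dist, viewing_distance_px))


def cutoff_frequency(ecc_deg):
    """Highest visible spatial frequency (cycles/degree) at an eccentricity.

    Obtained by setting the assumed contrast threshold
    CT(f, e) = CT0 * exp(ALPHA * f * (e + E2) / E2) to 1 and solving for f.
    """
    return E2 * math.log(1.0 / CT0) / (ALPHA * (ecc_deg + E2))
```

Under this assumed model the cutoff frequency halves at an eccentricity of `E2` degrees, which is why peripheral high-frequency content can be discarded with little perceptual cost.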


Journal ArticleDOI
TL;DR: A new optimal rate control algorithm for maximizing the FSNR is established using a Lagrange multiplier method defined on a curvilinear coordinate system and a piecewise R-D (rate-distortion)/R-Q (rate-quantization) model is developed.
Abstract: Previously, foveated video compression algorithms have been proposed which, in certain applications, deliver high-quality video at reduced bit rates by seeking to match the nonuniform sampling of the human retina. We describe such a framework here where foveated video is created by a nonuniform filtering scheme that increases the compressibility of the video stream. We maximize a new foveal visual quality metric, the foveal signal-to-noise ratio (FSNR), to determine the best compression and rate control parameters for a given target bit rate. Specifically, we establish a new optimal rate control algorithm for maximizing the FSNR using a Lagrange multiplier method defined on a curvilinear coordinate system. For optimal rate control, we also develop a piecewise R-D (rate-distortion)/R-Q (rate-quantization) model. A fast algorithm for searching for an optimal Lagrange multiplier λ* is subsequently presented. For the new models, we show how the reconstructed video quality is affected, where the FSNR is maximized, and demonstrate the coding performance for H.263, H.263+, H.263++, and MPEG-4 video coding. For H.263/MPEG video coding, a suboptimal rate control algorithm is developed for fast, high-performance applications. In the simulations, we compare the reconstructed pictures obtained using optimal rate control methods for foveated and normal video. We show that foveated video coding using the suboptimal rate control algorithm delivers excellent performance under 64 kb/s.

155 citations
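The Lagrange multiplier search described in this abstract can be sketched generically. This is a minimal illustration of Lagrangian rate allocation with a bisection search on the multiplier, not the paper's curvilinear-coordinate formulation or its R-D/R-Q model. The per-block weight is an assumed stand-in for foveal importance, and all names and data are hypothetical.

```python
def pick_options(blocks, lam):
    """For each block, choose the (rate, distortion) option that minimizes
    weight * distortion + lam * rate."""
    chosen = []
    for weight, options in blocks:
        chosen.append(min(options, key=lambda rd: weight * rd[1] + lam * rd[0]))
    return chosen


def total_rate(chosen):
    return sum(rate for rate, _ in chosen)


def allocate(blocks, rate_budget, iters=50):
    """Bisection on lam: find a small multiplier whose allocation fits the budget.

    Assumes the budget is reachable, i.e. the lowest-rate options fit.
    """
    lo, hi = 0.0, 1.0
    # Grow the upper bound until the allocation fits the budget.
    while total_rate(pick_options(blocks, hi)) > rate_budget and hi < 1e12:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total_rate(pick_options(blocks, mid)) > rate_budget:
            lo = mid   # too many bits: penalize rate more
        else:
            hi = mid   # fits: try a smaller penalty
    return hi, pick_options(blocks, hi)


# Two blocks with identical (rate, distortion) options; the first is foveal.
options = [(10, 1.0), (5, 4.0), (1, 16.0)]
blocks = [(1.0, options), (0.1, options)]
lam, chosen = allocate(blocks, rate_budget=15)
```

With these toy numbers the highly weighted block keeps its highest-rate option while the peripheral block drops to a cheaper one, which is the qualitative behavior a foveation-weighted distortion measure is meant to produce.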


Proceedings ArticleDOI
07 May 2001
TL;DR: A method for DCT-domain blind measurement of blocking artifacts is proposed in which a new block is formed across any two adjacent blocks and the blocking artifact is modeled as a 2-D step function.
Abstract: A method for DCT-domain blind measurement of blocking artifacts is proposed. By constituting a new block across any two adjacent blocks, the blocking artifact is modeled as a 2-D step function. A fast DCT-domain algorithm has been derived to constitute the new block and extract all parameters needed. Then a human visual system (HVS) based measurement of blocking artifacts is conducted. Experimental results have shown the effectiveness and stability of our method. The proposed technique can be used for online image/video quality monitoring and control in applications of DCT-domain image/video processing.

112 citations
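The idea of forming a new block that straddles two adjacent blocks and reading off a step amplitude can be shown with a simplified sketch. It works on one image row in the spatial domain, which is an assumption made for brevity: the abstract's algorithm operates in the DCT domain and adds HVS-based weighting, neither of which is reproduced here. Function names are hypothetical.

```python
def boundary_steps(row, block=8):
    """Step amplitude across each block boundary in one image row.

    A window of `block` samples is centered on every boundary; the step is
    the mean of its right half minus the mean of its left half.
    """
    half = block // 2
    steps = []
    for b in range(block, len(row) - half + 1, block):
        left = row[b - half:b]
        right = row[b:b + half]
        steps.append(sum(right) / half - sum(left) / half)
    return steps


def blockiness(row, block=8):
    """Mean absolute step amplitude over all block boundaries in the row."""
    steps = boundary_steps(row, block)
    return sum(abs(s) for s in steps) / len(steps) if steps else 0.0


row = [10] * 8 + [20] * 8 + [20] * 8
steps = boundary_steps(row)   # one jump of 10, then a flat boundary
```

This simplification does not separate genuine image edges or gradients from coding artifacts, so it should be read only as an illustration of the step-function model.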


Journal ArticleDOI
TL;DR: This work uses an AM-FM representation for each fingerprint to obtain significant gains in classification performance as compared to the commonly used National Institute of Standards system, for the same classifier.
Abstract: Research on fingerprint classification has primarily focused on finding improved classifiers, image and feature enhancement, and less on the development of novel fingerprint representations. Using an AM-FM representation for each fingerprint, we obtain significant gains in classification performance as compared to the commonly used National Institute of Standards system, for the same classifier.

62 citations


Proceedings ArticleDOI
07 Dec 2001
TL;DR: This work developed a new image quality metric called foveated wavelet image quality index (FWQI) in the wavelet transform domain and shows its effectiveness by using it as a guide for optimal bit assignment of an embedded foveated image coding system.
Abstract: The human visual system (HVS) is highly non-uniform in sampling, coding, processing and understanding. The spatial resolution of the HVS is highest around the point of fixation (foveation point) and decreases rapidly with increasing eccentricity. Currently, most image quality measurement methods are designed for uniform resolution images. These methods do not correlate well with the perceived foveated image quality. Wavelet analysis delivers a convenient way to simultaneously examine localized spatial as well as frequency information. We developed a new image quality metric called foveated wavelet image quality index (FWQI) in the wavelet transform domain. FWQI considers multiple factors of the HVS, including the spatial variance of the contrast sensitivity function, the spatial variance of the local visual cut-off frequency, the variance of human visual sensitivity in different wavelet subbands, and the influence of the viewing distance on the display resolution and the HVS features. FWQI can be employed for foveated region of interest (ROI) image coding and quality enhancement. We show its effectiveness by using it as a guide for optimal bit assignment of an embedded foveated image coding system. The coding system demonstrates very good coding performance and scalability in terms of foveated objective as well as subjective quality measurement. © 2001 SPIE, The International Society for Optical Engineering.

54 citations


Journal ArticleDOI
TL;DR: The nullspace-based approach can be formulated as an optimization problem, and this formulation implies a new subspace-based approach that uses matrix operations, has the same advantages as the nullspace-based one, and has lower computational complexity.
Abstract: Existing eigenstructure-based direct multichannel blind image restoration techniques include nullspace-based and direct deconvolver estimation techniques. The nullspace-based approach can be formulated as an optimization problem. We show that this formulation implies a new subspace-based approach that uses matrix operations. This new approach has the same advantages as the nullspace-based one but has lower computational complexity. Under some mild conditions, its complexity is equal to that of the FFT. Furthermore, the relation among the nullspace-based approach, the direct deconvolver estimation and the new subspace-based approach is studied.

45 citations


Proceedings ArticleDOI
07 Oct 2001
TL;DR: An objective quality metric for ROI coded images in the wavelet transform domain is developed, and its effectiveness is shown by applying it to an embedded foveated image coding system.
Abstract: Region of interest (ROI) image and video compression techniques have been widely used in visual communication applications in an effort to deliver good quality images and videos at limited bandwidths. Most image quality metrics have been developed for uniform resolution images. These metrics are not appropriate for the assessment of ROI coded images, where space-variant resolution is necessary. The spatial resolution of the human visual system (HVS) is highest around the point of fixation and decreases rapidly with increasing eccentricity. Since the ROIs are usually the regions "fixated" by human eyes, the foveation property of the HVS supplies a natural approach for guiding the design of ROI image quality measurement algorithms. We have developed an objective quality metric for ROI coded images in the wavelet transform domain. This metric can serve to mediate the compression and enhancement of ROI coded images and videos. We show its effectiveness by applying it to an embedded foveated image coding system.

37 citations


Proceedings ArticleDOI
07 Oct 2001
TL;DR: A segmentation algorithm is proposed for M-FISH images that minimizes the entropy of classified pixels within possible chromosomes and is shown to correctly decompose even difficult clusters of touching and overlapping chromosomes.
Abstract: In the early 1990s, the state-of-the-art in commercial chromosome image acquisition was grayscale. Automated chromosome classification was based on the grayscale image and boundary information obtained during segmentation. Multi-spectral image acquisition was developed in 1990 and commercialized in the mid-1990s. One acquisition method, multiplex fluorescence in-situ hybridization (M-FISH), uses five color dyes. We propose a segmentation algorithm for M-FISH images that minimizes the entropy of classified pixels within possible chromosomes. This method is shown to correctly decompose even difficult clusters of touching and overlapping chromosomes. Finally, an example image is given to illustrate the algorithm.

28 citations
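The entropy criterion named in this abstract can be illustrated with a toy example. The sketch below scores a candidate decomposition by the size-weighted entropy of per-pixel class labels and picks the best cut in a 1-D sequence of labels. The 1-D setting, the function names, and the label values are assumptions for illustration; the abstract does not give the details of the 2-D segmentation procedure.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (bits) of the class labels inside one candidate region."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def best_cut(labels):
    """Cut position that minimizes the size-weighted entropy of the two parts.

    Returns (position, cost); the parts are labels[:position], labels[position:].
    """
    n = len(labels)
    best = None
    for cut in range(1, n):
        left, right = labels[:cut], labels[cut:]
        cost = (len(left) * label_entropy(left)
                + len(right) * label_entropy(right)) / n
        if best is None or cost < best[1]:
            best = (cut, cost)
    return best


# Pixels along a line through two touching chromosomes, classified as 3 and 7.
cut, cost = best_cut([3, 3, 3, 3, 7, 7, 7])
```

A region whose pixels all share one class has zero entropy, so a decomposition into regions that each contain a single chromosome class scores best under this criterion.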


Proceedings ArticleDOI
07 May 2001
TL;DR: A novel local bandwidth constrained fast inverse motion compensation algorithm operating in the DCT-domain is proposed that achieves computational improvement of 25% to 55% without visual degradation and a reduction of blocking artifacts in very low bit-rate compressed video sequences.
Abstract: DCT-based digital video coding standards such as MPEG and H.26x are becoming more widely adopted for multimedia applications. Since the standards differ in their format and syntax, video transcoding, where a pre-coded video bit-stream is converted from one format to another format, is of interest for purposes such as channel bandwidth adaptation and video composition. DCT-domain video transcoding is generally more efficient than spatial domain transcoding. However, since the data is organized block by block in the DCT-domain, inverse motion compensation becomes the bottleneck for DCT-domain methods. We propose a novel local bandwidth constrained fast inverse motion compensation algorithm operating in the DCT-domain. Relative to Chang's (1995) algorithm, the proposed algorithm achieves computational improvement of 25% to 55% without visual degradation. A by-product of the proposed algorithm is a reduction of blocking artifacts in very low bit-rate compressed video sequences.

21 citations


Proceedings ArticleDOI
07 May 2001
TL;DR: A fast approximation of the foveation model is developed for low bit-rate video coding, real-time foveation is demonstrated in the spatial and DCT domains, and fast DCT-domain foveation is incorporated into the baseline H.263 video encoding standard.
Abstract: Video coding techniques employ characteristics of the human visual system (HVS) to achieve high coding efficiency. Lee and Bovik (2000) have exploited foveation, which is a non-uniform resolution representation of an image reflecting the sampling in the retina, for low bit-rate video coding. We develop a fast approximation of the foveation model and demonstrate real-time foveation techniques in the spatial domain and discrete cosine transform (DCT) domain. We incorporate fast DCT domain foveation into the baseline H.263 video encoding standard. We show that DCT-domain foveation requires much lower computational overhead but generates higher bit rates than spatial domain foveation. Our techniques do not require any modifications of the decoder.

21 citations
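One way to picture DCT-domain foveation is as a per-block mask that discards more high-frequency coefficients the farther a block lies from the fixation point. The sketch below assumes a linear falloff of the retained band with distance and a square low-frequency mask; both are hypothetical choices for illustration and are not the fast approximation developed in the paper.

```python
import math


def kept_band(block_center, fixation, max_dist, n=8):
    """Number of low-frequency rows/columns to keep for an n x n DCT block.

    Assumes a linear falloff with distance from fixation; the DC coefficient
    is always kept.
    """
    d = math.hypot(block_center[0] - fixation[0], block_center[1] - fixation[1])
    frac = max(0.0, 1.0 - d / max_dist)
    return max(1, round(frac * n))


def foveate_block(coeffs, band):
    """Zero every DCT coefficient whose row or column index is >= band."""
    n = len(coeffs)
    return [[coeffs[u][v] if u < band and v < band else 0 for v in range(n)]
            for u in range(n)]


ones = [[1] * 8 for _ in range(8)]
near = foveate_block(ones, kept_band((100, 100), (100, 100), 200))  # all kept
far = foveate_block(ones, kept_band((300, 100), (100, 100), 200))   # DC only
```

Because the mask only zeroes coefficients before entropy coding, a scheme of this kind leaves the bitstream syntax unchanged, consistent with the abstract's remark that no decoder modifications are required.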


Journal ArticleDOI
TL;DR: The proposed signal-adaptive FM transform produces point spectra for multidimensional signals with uniformly distributed samples, suggesting that the proposed transform is suitable for energy compaction and subsequent coding of broadband signals and images that locally exhibit significant level diversity.
Abstract: We present a novel class of multidimensional orthogonal FM transforms. The analysis suggests a novel signal-adaptive FM transform possessing interesting energy compaction properties. We show that the proposed signal-adaptive FM transform produces point spectra for multidimensional signals with uniformly distributed samples. This suggests that the proposed transform is suitable for energy compaction and subsequent coding of broadband signals and images that locally exhibit significant level diversity. We illustrate these concepts with simulation experiments.

Proceedings ArticleDOI
07 May 2001
TL;DR: This work proposes a foveation scalable video coding (FSVC) algorithm, which supplies good quality-compression performance as well as effective rate scalability to support simple and precise bit rate control.
Abstract: Recently, there have been two interesting trends in image and video coding research. One is to use human visual system (HVS) models to improve the current state-of-the-art coding algorithms by better exploiting the properties of the intended receiver. The other is to design rate-scalable video codecs, which allow the extraction of coded visual information at continuously varying bit rates from a single compressed bitstream. We follow these two trends and propose a foveation scalable video coding (FSVC) algorithm, which supplies good quality-compression performance as well as effective rate scalability to support simple and precise bit rate control. A foveation-based HVS model plays a key role in the algorithm. The algorithm is amenable to the inclusion of various HVS models and adaptable to different video communication applications.

Proceedings ArticleDOI
01 Jan 2001
TL;DR: An adaptive frame prediction scheme is proposed for foveation scalable video coding (FSVC), a new video coding algorithm that combines a foveation-based human visual system (HVS) model with a wavelet-based rate scalable coding algorithm.
Abstract: Embedded rate scalable video coding allows for the extraction of coded visual information at continuously varying bit rates from a single compressed bitstream. This is a very attractive feature for many multimedia communication applications. Motion estimation (ME)/motion compensation (MC) techniques are widely employed in various video coding systems to reduce temporal information redundancy. One of the major challenges in ME/MC based rate scalable video coding is how to generate the prediction frame from the previous frame to match the current frame. This problem is more difficult in rate scalable coding than in fixed rate coding because the decoding data rate is unavailable to the encoder. We propose an adaptive frame prediction scheme for foveation scalable video coding (FSVC), which is a new video coding algorithm that combines a foveation-based human visual system (HVS) model with a wavelet-based rate scalable coding algorithm. The new frame prediction algorithm provides an adaptive mechanism to control the prediction errors while reducing error propagation.

Proceedings ArticleDOI
07 Oct 2001
TL;DR: A look-up-table (LUT) based method for DCT domain inverse motion compensation is proposed by modeling the statistical distribution of the DCT coefficients in typical images and video sequences to save more than 50% of the computing time based on experimental results.
Abstract: DCT-based digital video coding standards such as MPEG and H.26x have been widely adopted for multimedia applications. Thus, video processing in the DCT domain usually proves to be more efficient than in the spatial domain. To directly convert an inter-coded frame into an intra-coded frame in the DCT domain, the problem of DCT domain inverse motion compensation was studied by Chang and Messerschmitt (1995). Since the data is organized block by block in the DCT domain, the DCT domain inverse motion compensation is computationally intensive. In this paper, a look-up-table (LUT) based method for DCT domain inverse motion compensation is proposed by modeling the statistical distribution of the DCT coefficients in typical images and video sequences. Compared to the method of Chang et al., the LUT based method can save more than 50% of the computing time based on experimental results. The memory requirement of the LUT is about 800 KB, which is reasonable. Moreover, the LUT can be shared by multiple DCT domain video processing applications running on the same computer.