Author

Alan C. Bovik

Bio: Alan C. Bovik is an academic researcher at the University of Texas at Austin. His research focuses on image quality and video quality. He has an h-index of 102 and has co-authored 837 publications receiving 96,088 citations. Previous affiliations of Alan C. Bovik include the University of Illinois at Urbana–Champaign and the University of Sydney.


Papers
Journal ArticleDOI
TL;DR: Two studies that were aimed towards increasing the understanding of how the visibility of distortions on stereoscopically viewed 3D images is affected by scene content and distortion types are described.
Abstract: We describe two studies aimed at increasing our understanding of how the visibility of distortions in stereoscopically viewed 3D images is affected by scene content and distortion type. By assuming that subjects' performance would be highly correlated with the visibility of local distorted patches, we analyzed subjects' performance in locating distorted patches when viewing stereoscopic 3D images. Performance was measured by whether subjects successfully located a local distorted patch, the time they spent completing the task, and the subjective quality ratings they gave. The visual data used in this work are co-registered stereo images with co-registered “ground truth” range (depth) data. Several statistical analysis methods were used to assess the significance of our observations. Three observations are drawn from our analyses. First, blur, JPEG, and JP2K distortions in stereo 3D images may be suppressed if one of the left or right views is undistorted. Second, contrast masking does not occur, or is reduced, while viewing white-noise-distorted stereo 3D images. Third, there is no depth/disparity masking effect when viewing stereo 3D images, but there may be (conversely) depth-related facilitation effects for blur, JPEG, and JP2K distorted stereo 3D images.

7 citations

Journal ArticleDOI
TL;DR: A database of typical “billboard” and “thumbnail” images viewed on mobile streaming applications is created and the effects of compression, scaling and chroma-subsampling on perceived quality are studied by conducting a subjective study.
Abstract: With the growing use of smart cellular devices for entertainment purposes, audio and video streaming services now offer an increasingly wide variety of popular mobile applications that offer portable and accessible ways to consume content. The user interfaces of these applications have become increasingly visual in nature, and are commonly loaded with dense multimedia content such as thumbnail images, animated GIFs, and short videos. To efficiently render these and to aid rapid download to the client display, it is necessary to compress, scale, and color-subsample them. These operations introduce distortions, reducing the appeal of the application. It is desirable to be able to automatically monitor and govern the visual quality of these images, which are usually quite small. However, while there exists a variety of high-performing image quality assessment (IQA) algorithms, none have been designed for this particular use case. This kind of content often has unique characteristics, such as overlaid graphics, intentional brightness gradients, text, and warping. We describe a study we conducted on the subjective and objective quality of images embedded in the displayed user interfaces of mobile streaming applications. We created a database of typical “billboard” and “thumbnail” images viewed on such services. Using the collected data, we studied the effects of compression, scaling, and chroma-subsampling on perceived quality by conducting a subjective study. We also evaluated the performance of leading picture quality prediction models on the new database. We report some surprising results regarding algorithm performance, and find that there remains ample scope for future model development.

7 citations

Proceedings ArticleDOI
06 Mar 2016
TL;DR: A full reference video quality assessment (VQA) framework that incorporates flicker sensitive temporal visual masking and predicts perceptually silenced flicker visibility using a model of the responses of primary visual cortex to video flicker, a motion energy model, and divisive normalization.
Abstract: From a series of human subjective studies, we have found that large motion can strongly suppress flicker visibility. Based on spectral analysis of flicker videos in the frequency domain, we propose a full reference video quality assessment (VQA) framework that incorporates flicker-sensitive temporal visual masking. The framework predicts perceptually silenced flicker visibility using a model of the responses of primary visual cortex to video flicker, a motion energy model, and divisive normalization. By incorporating perceptual flicker visibility into motion-tuned video quality measurements as in the MOVIE framework, we augment VQA performance with sensitivity to flicker. Results show that the proposed VQA framework correlates well with human results and is highly competitive with recent state-of-the-art VQA algorithms tested on the LIVE VQA database.
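The divisive normalization mentioned in the abstract is a standard cortical gain-control model: each filter-energy response is divided by a semi-saturation constant plus the pooled energy of the responses. The sketch below shows the general idea only; the semi-saturation constant and the pooling over all responses are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def divisive_normalization(energies, sigma2=0.1):
    """Divisively normalize a set of filter-energy responses.

    Sketch of the canonical gain-control model: each response is
    divided by a semi-saturation constant (sigma2, illustrative)
    plus the pooled energy of all responses.
    """
    e = np.asarray(energies, dtype=np.float64)
    return e / (sigma2 + e.sum())
```

Because the denominator is shared, the operation preserves the ordering of responses while compressing their overall range, which is what makes it useful as a masking model.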

7 citations

Proceedings ArticleDOI
02 Apr 2000
TL;DR: This work develops unique algorithms for assessing the quality of foveated image/video data and analyzes the increase in compression efficiency that is afforded by the foveation approach.
Abstract: We present a framework for assessing the quality of, and determining the efficiency of foveated and compressed images and video streams. We develop unique algorithms for assessing the quality of foveated image/video data. By interpreting foveation as a coordinate transformation, we analyze the increase in compression efficiency that is afforded by our foveation approach. We demonstrate these concepts on foveated, compressed video streams using modified (foveated) versions of H.263 that are standards-compliant. In the simulations, quality versus compression is enhanced considerably by the foveation approach. We obtain compression gains ranging from 8% to 52% for I pictures and from 7% to 68% for P pictures.
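The idea of foveation as a resolution fall-off with retinal eccentricity can be sketched as a per-pixel cutoff map. The e2 / (e + e2) fall-off below is a commonly used eccentricity model, but the function name, parameter values, and viewing geometry here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def foveation_cutoff_map(h, w, fixation, e2=2.3, view_dist_px=1000.0):
    """Per-pixel relative spatial-frequency cutoff for a foveated image.

    Sketch: resolution falls off as e2 / (e + e2), where e is retinal
    eccentricity in degrees and e2 is the half-resolution eccentricity
    constant. fixation is (row, col); view_dist_px is the viewing
    distance expressed in pixels. All parameter values are illustrative.
    """
    yy, xx = np.mgrid[0:h, 0:w]
    d = np.hypot(yy - fixation[0], xx - fixation[1])   # pixels from fixation
    ecc = np.degrees(np.arctan(d / view_dist_px))      # eccentricity in degrees
    return e2 / (ecc + e2)                             # relative cutoff in (0, 1]
```

A codec can spend fewer bits on regions where this cutoff is low, which is the source of the compression gains the abstract reports.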

7 citations

Proceedings ArticleDOI
14 May 2006
TL;DR: It is shown that unmodulated versions of these filters, termed toroidal Gaussian filters, can be used to detect the central mass region of spiculated masses in mammography.
Abstract: We have invented a new class of linear filters for the detection of spiculated masses and architectural distortions in mammography. We call these Spiculation Filters. These filters are narrow-band filters and form a new class of wavelet-type filter banks. In this paper, we show that unmodulated versions of these filters can be used to detect the central mass region of spiculated masses. We refer to these as toroidal Gaussian filters. We also show that the physical properties of spiculated masses can be extracted from the responses of the toroidal Gaussian filters without segmentation.
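As an illustration of the general idea of a ring-shaped filter, the sketch below builds a "toroidal" Gaussian kernel: a Gaussian profile centered at a fixed radius from the kernel center, giving an annular response. The radius, bandwidth, and normalization are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def toroidal_gaussian(size=65, r0=12.0, sigma=3.0):
    """Ring-shaped ("toroidal") Gaussian kernel.

    Sketch only: a Gaussian profile in radius, peaking at distance r0
    from the kernel center with spread sigma. Parameter values are
    illustrative, not taken from the paper.
    """
    c = size // 2
    yy, xx = np.mgrid[-c:c + 1, -c:c + 1]
    r = np.hypot(xx, yy)                               # distance from center
    k = np.exp(-((r - r0) ** 2) / (2.0 * sigma ** 2))  # Gaussian ring profile
    return k / k.sum()                                 # normalize to unit DC gain
```

Convolving a mammogram with such a kernel responds most strongly to roughly circular bright regions of radius near r0, which matches the stated use of detecting the central mass region.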

7 citations


Cited by
Journal ArticleDOI
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and is validated against both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
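The structural similarity index has a well-known closed form combining luminance, contrast, and structure comparisons. The sketch below computes that statistic once over a whole grayscale image; the published index instead averages it over local sliding windows (often Gaussian-weighted), so this is a simplified global variant, not the reference implementation.

```python
import numpy as np

def ssim_global(x, y, L=255, K1=0.01, K2=0.03):
    """Global (single-window) SSIM between two grayscale images.

    Simplified sketch: SSIM = ((2*mu_x*mu_y + C1)(2*cov_xy + C2)) /
    ((mu_x^2 + mu_y^2 + C1)(var_x + var_y + C2)), computed once over
    the whole image rather than over local windows. L is the dynamic
    range; K1, K2 are the standard stabilizing constants.
    """
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)
    den = (mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2)
    return num / den
```

An identical pair of images scores exactly 1; any distortion that perturbs the means, variances, or cross-covariance pulls the score below 1, which is the sense in which the index tracks structural degradation.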

40,609 citations

Book
01 Jan 1998
TL;DR: A textbook covering Fourier analysis, time-frequency methods, frames, wavelet bases, wavelet packets and local cosine bases, approximation, estimation, and transform coding.
Abstract: Introduction to a Transient World. Fourier Kingdom. Discrete Revolution. Time Meets Frequency. Frames. Wavelet Zoom. Wavelet Bases. Wavelet Packet and Local Cosine Bases. An Approximation Tour. Estimations are Approximations. Transform Coding. Appendix A: Mathematical Complements. Appendix B: Software Toolboxes.

17,693 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
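The learned objective in this formulation combines an adversarial term with an L1 reconstruction term. The numpy sketch below shows the generator-side loss; the λ = 100 weighting follows the paper, but the function name, the non-saturating form of the adversarial term, and the input shapes are illustrative assumptions.

```python
import numpy as np

def pix2pix_generator_loss(d_fake, fake, target, lam=100.0):
    """Generator objective from the conditional GAN formulation.

    Sketch only: d_fake holds discriminator outputs in (0, 1) on
    generated (input, output) pairs; the loss combines a non-saturating
    adversarial term with an L1 reconstruction term weighted by lam
    (100 in the paper).
    """
    adv = -np.log(np.clip(d_fake, 1e-7, 1.0)).mean()  # adversarial term
    l1 = np.abs(fake - target).mean()                 # reconstruction term
    return adv + lam * l1
```

The L1 term keeps outputs close to the ground truth at low frequencies, while the adversarial term penalizes the blurriness that a pure L1 loss would otherwise produce.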

11,958 citations

Posted Content
TL;DR: Conditional adversarial networks as discussed by the authors provide a general-purpose solution to image-to-image translation problems, and can be used to synthesize photos from label maps, reconstruct objects from edge maps, and colorize images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Journal ArticleDOI
01 Apr 1988, Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit submarine fan systems better. Calciclastic submarine fans are consequently rarely described and poorly understood, and very little is known about mud-dominated calciclastic submarine fan systems in particular. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of the Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin-section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) calciturbidites, comprising mostly high- to low-density, wavy-laminated bioclast-rich facies; 2) low-density mudstones, characterised by planar-laminated and unlaminated mud-dominated facies; and 3) calcidebrites, which are muddy or hyper-concentrated debris-flow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones.

9,929 citations