Author

Alan C. Bovik

Bio: Alan C. Bovik is an academic researcher at the University of Texas at Austin. The author has contributed to research on the topics of image quality and video quality. The author has an h-index of 102 and has co-authored 837 publications that have received 96,088 citations. Previous affiliations of Alan C. Bovik include the University of Illinois at Urbana–Champaign and the University of Sydney.


Papers
Journal Article
TL;DR: A new model for no-reference 3D stereopair quality assessment that considers the impact of binocular fusion, rivalry, suppression, and a reverse saliency effect on the perception of distortion, and is thoroughly evaluated on the LIVE 3D image quality database.
Abstract: We develop a new model for no-reference 3D stereopair quality assessment that considers the impact of binocular fusion, rivalry, suppression, and a reverse saliency effect on the perception of distortion. The resulting framework, dubbed the S3D INtegrated Quality (SINQ) Predictor, first fuses the left and right views of a stereopair into a single synthesized cyclopean image using a novel modification of an existing binocular perceptual model. Specifically, the left and right views of a stereopair are fused using a measure of “cyclopean” spatial activity. A simple product estimate is also calculated as the correlation between left and right disparity-corrected corresponding binocular pixels. Univariate and bivariate statistical features are extracted from the four available image sources: the left view, the right view, the synthesized “cyclopean” spatial activity image, and the binocular product image. Based on recent evidence regarding the placement of 3D fixation by subjects viewing stereoscopic 3D (S3D) content, we also deploy a reverse saliency weighting on the normalized “cyclopean” spatial activity image. Both one- and two-stage frameworks are then used to map the feature vectors to predicted quality scores. SINQ is thoroughly evaluated on the LIVE 3D image quality database (Phase I and Phase II). The experimental results show that SINQ delivers better performance than state of the art 2D and 3D quality assessment methods on six public databases, especially on asymmetric distortions.

67 citations

Journal Article
TL;DR: This work shows that the new video QA algorithms are highly responsive to packet loss errors, and proposes a general framework for constructing temporal video quality assessment (QA) algorithms that seek to assess transient temporal errors, such as packet losses.
Abstract: We examine the effect that variations in the temporal quality of videos have on global video quality. We also propose a general framework for constructing temporal video quality assessment (QA) algorithms that seek to assess transient temporal errors, such as packet losses. The proposed framework modifies simple frame-based quality assessment algorithms by incorporating a temporal quality variance factor. We use packet loss from channel errors as a specific study of practical significance. Using the PSNR and the SSIM index as exemplars, we are able to show that the new video QA algorithms are highly responsive to packet loss errors.

67 citations
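
The temporal quality variance idea in the abstract above can be illustrated with a small sketch. The abstract does not give the exact formula, so the pooling rule here (mean frame score minus a penalty on the standard deviation of frame scores) and the names `temporal_pooled_quality` and `penalty` are assumptions made for illustration, not the authors' method.

```python
import math
import statistics


def psnr(ref, dist, peak=255.0):
    """PSNR in dB between two equally sized frames given as flat lists of pixel values."""
    mse = sum((r - d) ** 2 for r, d in zip(ref, dist)) / len(ref)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(peak ** 2 / mse)


def temporal_pooled_quality(frame_scores, penalty=1.0):
    """Mean frame quality minus a penalty on temporal variation.

    Hypothetical pooling rule: the paper's actual variance factor is not
    specified in the abstract.
    """
    mean_q = statistics.fmean(frame_scores)
    spread = statistics.pstdev(frame_scores)
    return mean_q - penalty * spread


# Two sequences with the same mean frame PSNR (35 dB).
steady = [35.0, 35.0, 35.0, 35.0]
bursty = [40.0, 40.0, 40.0, 20.0]  # one frame degraded, e.g. by a packet loss

steady_score = temporal_pooled_quality(steady)
bursty_score = temporal_pooled_quality(bursty)
```

Plain averaging would rate both sequences equally; under this assumed rule the sequence with a transient drop scores lower, which is the kind of sensitivity to packet-loss errors the abstract describes.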

Journal Article
TL;DR: A new content-weighted method for full-reference (FR) video quality assessment using a three-component image model that classifies image local regions according to their image gradient properties and applies variable weights to structural similarity image index (SSIM) and peak signal-to-noise ratio (PSNR) scores.
Abstract: Objective image and video quality measures play important roles in numerous image and video processing applications. In this work, we propose a new content-weighted method for full-reference (FR) video quality assessment using a three-component image model. Using the idea that different image regions have different perceptual significance relative to quality, we deploy a model that classifies image local regions according to their image gradient properties, then apply variable weights to structural similarity image index (SSIM) (and peak signal-to-noise ratio (PSNR)) scores according to region. A frame-based video quality assessment algorithm is thereby derived. Experimental results on the Video Quality Experts Group (VQEG) FR-TV Phase 1 test dataset show that the proposed algorithm outperforms existing video quality assessment methods. © 2010 SPIE and IS&T. DOI: 10.1117/1.3267087

66 citations
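
The region-weighting idea in the abstract above can be sketched as follows. The abstract only says that regions are classified by gradient properties and weighted differently; the three region names (edge, texture, smooth), the thresholds `hi` and `lo`, the weights, and the forward-difference gradient are all assumptions chosen for illustration, and the per-pixel `quality_map` stands in for an SSIM or PSNR-style map computed elsewhere.

```python
def gradient_magnitude(img):
    """Forward-difference gradient magnitude for a 2D list of pixel values."""
    h, w = len(img), len(img[0])
    grad = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            gx = img[i][min(j + 1, w - 1)] - img[i][j]
            gy = img[min(i + 1, h - 1)][j] - img[i][j]
            grad[i][j] = (gx * gx + gy * gy) ** 0.5
    return grad


def region_weighted_score(img, quality_map, weights=(0.5, 0.25, 0.25), hi=0.12, lo=0.06):
    """Weighted mean of a per-pixel quality map over edge/texture/smooth regions.

    weights, hi and lo are hypothetical values, not taken from the paper.
    """
    grad = gradient_magnitude(img)
    g_max = max(max(row) for row in grad)
    sums = [0.0, 0.0, 0.0]    # edge, texture, smooth
    counts = [0, 0, 0]
    for i, row in enumerate(grad):
        for j, g in enumerate(row):
            if g_max > 0 and g > hi * g_max:
                region = 0    # strong gradient: edge
            elif g_max == 0 or g < lo * g_max:
                region = 2    # weak gradient: smooth
            else:
                region = 1    # in between: texture
            sums[region] += quality_map[i][j]
            counts[region] += 1
    used = [r for r in range(3) if counts[r] > 0]
    total_weight = sum(weights[r] for r in used)
    return sum(weights[r] * sums[r] / counts[r] for r in used) / total_weight


# A 3x4 image with one vertical edge; quality is lower exactly on the edge column.
img = [[0, 0, 100, 100] for _ in range(3)]
quality = [[1.0, 0.5, 1.0, 1.0] for _ in range(3)]
weighted = region_weighted_score(img, quality)
plain_mean = sum(sum(row) for row in quality) / 12
```

Because the distorted pixels sit on an edge and edges carry the largest assumed weight, the weighted score falls below the plain mean of the quality map.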

Journal Article
TL;DR: The 3D-AVM Predictor accounts for anomalous motor responses of both accommodation and vergence, yielding predictive power that is statistically superior to prior models that rely on a computed disparity distribution only.
Abstract: To achieve clear binocular vision, neural processes that accomplish accommodation and vergence are performed via two collaborative, cross-coupled processes: accommodation-vergence (AV) and vergence-accommodation (VA). However, when people watch stereo images on stereoscopic displays, normal neural functioning may be disturbed owing to anomalies of the cross-link gains. These anomalies are likely the main cause of visual discomfort experienced when viewing stereo images, and are called Accommodation-Vergence Mismatches (AVM). Moreover, the absence of any useful accommodation depth cues when viewing 3D content on a flat panel (planar) display induces anomalous demands on binocular fusion, resulting in possible additional visual discomfort. Most prior efforts in this direction have focused on predicting anomalies in the AV cross-link using measurements on a computed disparity map. We further these contributions by developing a model that accounts for both accommodation and vergence, resulting in a new visual discomfort prediction algorithm dubbed the 3D-AVM Predictor. The 3D-AVM model and algorithm make use of a new concept we call local 3D bandwidth (BW) which is defined in terms of the physiological optics of binocular vision and foveation. The 3D-AVM Predictor accounts for anomalous motor responses of both accommodation and vergence, yielding predictive power that is statistically superior to prior models that rely on a computed disparity distribution only.

65 citations

Journal Article
TL;DR: A variety of recurrent dynamic neural networks are proposed that conduct continuous-time subjective QoE prediction on video streams impaired by both compression artifacts and rebuffering events, and ways of aggregating different models into a forecasting ensemble that delivers improved results with reduced forecasting variance are evaluated.
Abstract: Streaming video services represent a very large fraction of global bandwidth consumption. Due to the exploding demands of mobile video streaming services, coupled with limited bandwidth availability, video streams are often transmitted through unreliable, low-bandwidth networks. This unavoidably leads to two types of major streaming-related impairments: compression artifacts and/or rebuffering events. In streaming video applications, the end-user is a human observer; hence being able to predict the subjective Quality of Experience (QoE) associated with streamed videos could lead to the creation of perceptually optimized resource allocation strategies driving higher quality video streaming services. We propose a variety of recurrent dynamic neural networks that conduct continuous-time subjective QoE prediction. By formulating the problem as one of time-series forecasting, we train a variety of recurrent neural networks and non-linear autoregressive models to predict QoE using several recently developed subjective QoE databases. These models combine multiple, diverse neural network inputs, such as predicted video quality scores, rebuffering measurements, and data related to memory and its effects on human behavioral responses, using them to predict QoE on video streams impaired by both compression artifacts and rebuffering events. Instead of finding a single time-series prediction model, we propose and evaluate ways of aggregating different models into a forecasting ensemble that delivers improved results with reduced forecasting variance. We also deploy appropriate new evaluation metrics for comparing time-series predictions in streaming applications. Our experimental results demonstrate improved prediction performance that approaches human performance. An implementation of this work can be found at https://github.com/christosbampis/NARX_QoE_release .

63 citations
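
The forecasting-ensemble idea in the abstract above can be illustrated with the simplest possible aggregator. The abstract says several aggregation strategies were evaluated but does not name them, so the per-time-step mean used here is an assumption, and the forecast values are made-up toy numbers rather than results from the paper.

```python
import statistics


def ensemble_forecast(forecasts):
    """Average several equal-length forecast series time step by time step."""
    if not forecasts:
        raise ValueError("need at least one forecast series")
    length = len(forecasts[0])
    if any(len(series) != length for series in forecasts):
        raise ValueError("all forecast series must have the same length")
    return [statistics.fmean(step) for step in zip(*forecasts)]


def rmse(pred, truth):
    """Root-mean-square error between a forecast and the reference series."""
    return (sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)) ** 0.5


# Toy continuous-time QoE trace and two hypothetical models with opposite biases.
truth = [3.0, 3.5, 2.0, 4.0]
model_a = [3.4, 3.9, 2.4, 4.4]  # consistently too high
model_b = [2.6, 3.1, 1.6, 3.6]  # consistently too low
combined = ensemble_forecast([model_a, model_b])
```

In this toy case the opposite errors cancel, so the combined forecast tracks the reference more closely than either model alone. Real model errors are only partly uncorrelated, so the gain is usually smaller.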


Cited by
Journal Article
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and it is compared with both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.

40,609 citations
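
For readers unfamiliar with the structural similarity index named in the abstract above, here is a minimal sketch of the SSIM formula applied to a single window. The published index is computed over local windows and averaged across the image; this simplified version treats the whole signal as one window. The function name `global_ssim` is hypothetical, and the constants `k1=0.01` and `k2=0.03` are the commonly used defaults rather than values stated in the abstract.

```python
import statistics


def global_ssim(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM computed over one window covering two equal-length pixel lists.

    Simplification: the published index averages SSIM over local windows.
    """
    c1 = (k1 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (k2 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    n = len(x)
    mu_x = statistics.fmean(x)
    mu_y = statistics.fmean(y)
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator


reference = [10.0, 50.0, 90.0, 130.0]
reversed_structure = [130.0, 90.0, 50.0, 10.0]  # same mean and variance, opposite structure

same = global_ssim(reference, reference)
different = global_ssim(reference, reversed_structure)
```

The two signals have identical mean and variance, so a measure based only on those statistics could not separate them; the covariance term is what lowers the score when the structure changes.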

Book
01 Jan 1998
TL;DR: An introduction to a Transient World and an Approximation Tour of Wavelet Packet and Local Cosine Bases.
Abstract: Introduction to a Transient World. Fourier Kingdom. Discrete Revolution. Time Meets Frequency. Frames. Wavelet Zoom. Wavelet Bases. Wavelet Packet and Local Cosine Bases. An Approximation Tour. Estimations are Approximations. Transform Coding. Appendix A: Mathematical Complements. Appendix B: Software Toolboxes.

17,693 citations

Proceedings Article
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,958 citations

Posted Content
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems; they can be used to synthesize photos from label maps, reconstruct objects from edge maps, and colorize images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Journal Article
01 Apr 1988-Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. As a result, very little is known, especially about mud-dominated calciclastic submarine fan systems. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) that reveal a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) calciturbidites, comprising mostly high- to low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones, which are characterised by planar laminated and unlaminated mud-dominated facies; and 3) calcidebrites, which are muddy or hyper-concentrated debris-flow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These

9,929 citations