Author

Alan C. Bovik

Bio: Alan C. Bovik is an academic researcher from the University of Texas at Austin. The author has contributed to research on topics including image quality and video quality. The author has an h-index of 102 and has co-authored 837 publications receiving 96,088 citations. Previous affiliations of Alan C. Bovik include the University of Illinois at Urbana–Champaign and the University of Sydney.


Papers
Proceedings ArticleDOI
23 May 2010
TL;DR: A DCT (Discrete Cosine Transform)-based bit allocation scheme to maximize image quality under a given bit rate constraint, with two proposed algorithms: “SLO” (Searching from the Lowest Order) and “SPP” (Searching the Present optimal bit allocation from the Previous optimal bit allocation).
Abstract: Recently, VSNs (Visual Sensor Networks) are becoming of increased interest in a variety of applications. Due to power constraints at each visual sensor node, the processing of live video is constrained. In this paper, we propose a DCT (Discrete Cosine Transform)-based bit allocation scheme to maximize image quality for a given bit rate constraint. Two major featured algorithms are proposed: “SLO” (Searching from the Lowest Order) and “SPP” (Searching the Present optimal bit allocation from the Previous optimal bit allocation). The proposed algorithms reduce the search range based on a simple IQA (Image Quality Assessment) model and DCT statistics. Thus, they demonstrate low complexity and fast computation. In the simulations, we show the superiority of the proposed algorithms over conventional approaches.
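The paper's SLO and SPP search procedures are not spelled out in the abstract, but the underlying idea of allocating a bit budget to maximize quality under a rate constraint can be sketched with a toy model. Everything below is an assumption for illustration: the band variances, the classical variance-times-2^(-2b) distortion proxy, and the greedy reduced-range search standing in for the paper's actual algorithms.

```python
import itertools


def quality(bits, variances):
    # Classic rate-distortion proxy: distortion per band ~ variance * 2^(-2b);
    # quality is the negative total distortion (higher is better).
    return -sum(v * 2.0 ** (-2 * b) for b, v in zip(bits, variances))


def brute_force(budget, variances):
    """Exhaustively search every allocation that spends the whole budget."""
    best = None
    for alloc in itertools.product(range(budget + 1), repeat=len(variances)):
        if sum(alloc) != budget:
            continue
        q = quality(alloc, variances)
        if best is None or q > best[1]:
            best = (alloc, q)
    return best[0]


def greedy(budget, variances):
    """Reduced-range search: assign one bit at a time to the band with the
    largest marginal distortion reduction, in the spirit of pruning the
    full search space (not the paper's SLO/SPP)."""
    bits = [0] * len(variances)
    for _ in range(budget):
        gains = [v * (2.0 ** (-2 * b) - 2.0 ** (-2 * (b + 1)))
                 for b, v in zip(bits, variances)]
        bits[gains.index(max(gains))] += 1
    return tuple(bits)
```

Because the assumed distortion model is convex in the bit count, the greedy pass lands on the same allocation as the exhaustive search while touching far fewer candidates, which is the kind of complexity reduction the abstract claims.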
01 Jan 1991
TL;DR: In this article, the chromatic photometric constraint is used to specify a mathematical optimality criterion for solving the dense stereo correspondence problem, and the results demonstrate that using chromatic information can significantly improve the performance of stereo correspondence.
Abstract: We investigate the use of chromatic information in dense stereo correspondence. Specifically, the chromatic photometric constraint, which is used to specify a mathematical optimality criterion for solving the dense stereo correspondence problem, is developed. The result is a theoretical construction for developing dense stereo correspondence algorithms which use chromatic information. The efficacy of using chromatic information via this construction is tested by implementing single- and multi-resolution versions of a stereo correspondence algorithm which uses simulated annealing as a means of solving the optimization problem. Results demonstrate that the use of chromatic information can significantly improve the performance of dense stereo correspondence.
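The paper's actual criterion (optimized via simulated annealing) is not reproduced in the abstract; the sketch below only illustrates the core intuition, under assumed details: a 1D scanline, an SSD matching cost, and a per-channel photometric constraint so that chromatic differences penalize false matches that grayscale intensity alone cannot distinguish.

```python
import numpy as np


def match_cost(left, right, d, window=1):
    """SSD cost of matching left[i] to right[i - d], summed over all
    color channels inside a small window (a toy photometric constraint)."""
    n = left.shape[0]
    cost, count = 0.0, 0
    for i in range(window, n - window):
        j = i - d
        if j - window < 0 or j + window >= n:
            continue  # window falls off the scanline
        patch_l = left[i - window:i + window + 1]
        patch_r = right[j - window:j + window + 1]
        cost += float(((patch_l - patch_r) ** 2).sum())
        count += 1
    return cost / max(count, 1)


def best_disparity(left, right, max_d):
    """Pick the disparity minimizing the chromatic matching cost."""
    costs = [match_cost(left, right, d) for d in range(max_d + 1)]
    return int(np.argmin(costs))
```

With a synthetic RGB scanline shifted by a known disparity, the per-channel cost recovers the shift; collapsing the channels to grayscale first discards exactly the chromatic information the paper argues is useful.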
Posted Content
TL;DR: This informal paper and talk discusses the author's ups and downs in trying to deal with extremely rapid technological change, and how he reacted to, and dealt with, the consequent dramatic changes in the relevance of the topics he has taught for three decades.
Abstract: In this rather informal paper and talk I will discuss my own experiences, feelings, and evolution as an Image Processing and Digital Video educator trying to navigate the Deep Learning revolution. I will discuss my own ups and downs of trying to deal with extremely rapid technological changes, and how I have reacted to, and dealt with consequent dramatic changes in the relevance of the topics I've taught for three decades. I have arranged the discussion in terms of the stages, over time, of my progression dealing with these sea changes.
06 Apr 2023
TL;DR: In this article, the authors propose a novel Fusion of Unified Quality Evaluators (FUNQUE) framework, by enabling computation sharing and by using a transform that is sensitive to visual perception to boost accuracy.
Abstract: The Visual Multimethod Assessment Fusion (VMAF) algorithm has recently emerged as a state-of-the-art approach to video quality prediction, that now pervades the streaming and social media industry. However, since VMAF requires the evaluation of a heterogeneous set of quality models, it is computationally expensive. Given other advances in hardware-accelerated encoding, quality assessment is emerging as a significant bottleneck in video compression pipelines. Towards alleviating this burden, we propose a novel Fusion of Unified Quality Evaluators (FUNQUE) framework, by enabling computation sharing and by using a transform that is sensitive to visual perception to boost accuracy. Further, we expand the FUNQUE framework to define a collection of improved low-complexity fused-feature models that advance the state-of-the-art of video quality performance with respect to both accuracy and computational efficiency.
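The computation-sharing idea can be sketched in miniature. This is a hedged illustration, not the published FUNQUE features: a single Haar subband decomposition is computed once per frame pair, and several quality features are then read off the shared coefficients instead of each constituent metric recomputing its own transform, which is where the savings over a heterogeneous fusion like VMAF would come from.

```python
import numpy as np


def haar_subbands(img):
    """One-level 2D Haar decomposition into four subbands (assumes
    even image dimensions)."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    return {
        "approx": (a + b + c + d) / 4.0,
        "horiz":  (a + b - c - d) / 4.0,
        "vert":   (a - b + c - d) / 4.0,
        "diag":   (a - b - c + d) / 4.0,
    }


def shared_features(ref, dist):
    """Derive two illustrative quality features from ONE shared transform."""
    sb_r, sb_d = haar_subbands(ref), haar_subbands(dist)
    # Feature 1: mean absolute error of the approximation band (luma fidelity).
    f1 = float(np.abs(sb_r["approx"] - sb_d["approx"]).mean())
    # Feature 2: detail-energy ratio over the detail bands (contrast loss).
    er = sum(float((sb_r[k] ** 2).sum()) for k in ("horiz", "vert", "diag"))
    ed = sum(float((sb_d[k] ** 2).sum()) for k in ("horiz", "vert", "diag"))
    f2 = ed / (er + 1e-12)
    return f1, f2
```

In a fused model these features would feed a learned regressor; the point of the sketch is only that both are computed from one transform pass.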
Book ChapterDOI
04 Oct 2010
TL;DR: In this chapter, the structural similarity index (SSIM) is combined with motion information from the compressed stream for quality assessment of MPEG-4/H.264 compressed videos, and the performance of the algorithm is evaluated on the Video Quality Experts Group (VQEG) FRTV Phase-I dataset.
Abstract: The recent MPEG-4 Part 10/H.264 AVC standard enables improved compression rates without compromising on visual quality. 'Visual quality' here refers to the quality of a video as perceived by a human observer. It is this aspect of visual quality that we focus upon in this chapter. It is widely agreed that the most commonly used mean squared error (MSE) correlates poorly with the human perception of quality. MSE is a full reference (FR) video quality assessment (VQA) algorithm. FR VQA algorithms are those that require both the original as well as the distorted videos in order to predict the perceived quality of the video. Recently proposed FR VQA algorithms have been shown to correlate well with human perception of quality. However, a practically implementable solution remains elusive. In this chapter, we detail one possible approach to real-time quality assessment, developed specifically for MPEG-4 compressed videos. This algorithm leverages the computational simplicity of the structural similarity index (SSIM) for image quality assessment (IQA), and incorporates motion information embedded in the compressed motion vectors from the H.264 compressed stream to evaluate visual quality. We detail the algorithm and demonstrate its performance on the popular Video Quality Experts Group (VQEG) FRTV Phase-I dataset. Further, we describe a subjective study that we have undertaken specifically for H.264 compressed videos. We compare the performance of leading VQA algorithms on this database and make the database available free-of-cost for researchers in order to further the field of VQA for MPEG-4 compressed videos.
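The key efficiency trick the chapter describes is reusing motion vectors that the H.264 stream already carries, instead of re-estimating motion. A minimal sketch of that idea, with the pooling weights and masking model assumed for illustration (they are not the chapter's exact formulation):

```python
import numpy as np


def motion_weighted_pool(frame_ssim, motion_vectors):
    """Pool per-frame SSIM scores into one video score, down-weighting
    high-motion frames where temporal masking makes distortions less
    visible. `motion_vectors[t]` is an (N, 2) array of decoded H.264
    motion vectors for frame t (assumed already extracted)."""
    frame_ssim = np.asarray(frame_ssim, dtype=np.float64)
    # Mean motion magnitude per frame, straight from the bitstream's vectors.
    motion = np.array([np.linalg.norm(mv, axis=-1).mean()
                       for mv in motion_vectors])
    weights = 1.0 / (1.0 + motion)  # assumed masking model
    return float((weights * frame_ssim).sum() / weights.sum())
```

With zero motion everywhere the pooled score reduces to the plain mean of the frame scores; as motion grows on a frame, that frame's influence on the final score shrinks.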

Cited by
Journal ArticleDOI
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and is validated against both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.

40,609 citations

Book
01 Jan 1998
TL;DR: An introduction to a transient world, followed by a tour through wavelet bases, wavelet packet and local cosine bases, approximation, estimation, and transform coding.
Abstract: Introduction to a Transient World. Fourier Kingdom. Discrete Revolution. Time Meets Frequency. Frames. Wavelet Zoom. Wavelet Bases. Wavelet Packet and Local Cosine Bases. An Approximation Tour. Estimations are Approximations. Transform Coding. Appendix A: Mathematical Complements. Appendix B: Software Toolboxes.
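The wavelet bases the chapter list refers to can be illustrated with their simplest member, the Haar basis. This is a generic textbook construction, not code from the book: one orthonormal analysis/synthesis level splits a signal into local averages and local differences.

```python
import numpy as np


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = np.asarray(signal, dtype=np.float64)
    approx = (s[0::2] + s[1::2]) / np.sqrt(2.0)  # low-pass: local averages
    detail = (s[0::2] - s[1::2]) / np.sqrt(2.0)  # high-pass: local differences
    return approx, detail


def haar_inverse(approx, detail):
    """Exactly invert one Haar analysis step."""
    s = np.empty(2 * approx.size)
    s[0::2] = (approx + detail) / np.sqrt(2.0)
    s[1::2] = (approx - detail) / np.sqrt(2.0)
    return s
```

Because the basis is orthonormal, reconstruction is perfect and energy is preserved across the subbands; recursing `haar_step` on the approximation band yields the multiresolution "wavelet zoom" of the chapter titles.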

17,693 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of Twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.
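The "learned loss" the abstract refers to is the conditional GAN objective; in the published pix2pix paper the generator is trained against the discriminator together with an L1 reconstruction term:

```latex
\mathcal{L}_{cGAN}(G, D) =
  \mathbb{E}_{x,y}\big[\log D(x, y)\big] +
  \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x, z))\big)\big]

G^{*} = \arg\min_{G}\max_{D}\;
  \mathcal{L}_{cGAN}(G, D) + \lambda\,\mathcal{L}_{L1}(G),
\qquad
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\big[\lVert y - G(x, z)\rVert_{1}\big]
```

Here the discriminator D sees the input image x alongside the real or generated output, which is what makes the adversarial loss conditional; the L1 term keeps outputs close to ground truth while the adversarial term supplies the sharpness that hand-engineered losses like plain L1 or L2 tend to blur away.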

11,958 citations

Posted Content
TL;DR: Conditional adversarial networks, as discussed by the authors, are a general-purpose solution to image-to-image translation problems, which can be used to synthesize photos from label maps, reconstruct objects from edge maps, and colorize images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Journal ArticleDOI
01 Apr 1988-Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. Subsequently, very little is known, especially about mud-dominated calciclastic submarine fan systems. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of the Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) calciturbidites, comprising mostly of high- to low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones, which are characterised by planar laminated and unlaminated mud-dominated facies; and 3) calcidebrites, which are muddy or hyper-concentrated debris-flow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones.

9,929 citations