scispace - formally typeset
Search or ask a question
Author

Alan C. Bovik

Bio: Alan C. Bovik is an academic researcher from University of Texas at Austin. The author has contributed to research in topics: Image quality & Video quality. The author has an hindex of 102, co-authored 837 publications receiving 96088 citations. Previous affiliations of Alan C. Bovik include University of Illinois at Urbana–Champaign & University of Sydney.


Papers
More filters
Journal ArticleDOI
TL;DR: A new model, called Oriented Gradients Image Quality Assessment (OG-IQA), is shown to deliver highly competitive image quality prediction performance as compared with the most popular IQA approaches and has a relatively low time complexity.
Abstract: The image gradient is a commonly computed image feature and a potentially predictive factor for image quality assessment (IQA). Indeed, it has been successfully used for both full- and no- reference image quality prediction. However, the gradient orientation has not been deeply explored as a predictive source of information for image quality assessment. Here we seek to amend this by studying the quality relevance of the relative gradient orientation, viz., the gradient orientation relative to the surround. We also deploy a relative gradient magnitude feature which accounts for perceptual masking and utilize an AdaBoosting back-propagation (BP) neural network to map the image features to image quality. The generalization of the AdaBoosting BP neural network results in an effective and robust quality prediction model. The new model, called Oriented Gradients Image Quality Assessment (OG-IQA), is shown to deliver highly competitive image quality prediction performance as compared with the most popular IQA approaches. Furthermore, we show that OG-IQA has good database independence properties and a low complexity. OG-IQA extracts a 6-dimensional relative gradient feature vector from the inputs.OG-IQA utilizes an AdaBoosting BP neural network to map the image features to image quality.OG-IQA delivers highly competitive image quality prediction performance and has a relatively low time complexity.

221 citations

Proceedings ArticleDOI
TL;DR: The recent MOtion-based Video Integrity Evaluation (MOVIE) index emerges as the leading objective VQA algorithm in this study, while the performance of the Video Quality Metric (VQM) and the Multi-Scale Structural SIMilarity (MS-SSIM) index is noteworthy.
Abstract: Automatic methods to evaluate the perceptual quality of a digital video sequence have widespread applications wherever the end-user is a human. Several objective video quality assessment (VQA) algorithms exist, whose performance is typically evaluated using the results of a subjective study performed by the video quality experts group (VQEG) in 2000. There is a great need for a free, publicly available subjective study of video quality that embodies state-of-the-art in video processing technology and that is effective in challenging and benchmarking objective VQA algorithms. In this paper, we present a study and a resulting database, known as the LIVE Video Quality Database, where 150 distorted video sequences obtained from 10 different source video content were subjectively evaluated by 38 human observers. Our study includes videos that have been compressed by MPEG-2 and H.264, as well as videos obtained by simulated transmission of H.264 compressed streams through error prone IP and wireless networks. The subjective evaluation was performed using a single stimulus paradigm with hidden reference removal, where the observers were asked to provide their opinion of video quality on a continuous scale. We also present the performance of several freely available objective, full reference (FR) VQA algorithms on the LIVE Video Quality Database. The recent MOtion-based Video Integrity Evaluation (MOVIE) index emerges as the leading objective VQA algorithm in our study, while the performance of the Video Quality Metric (VQM) and the Multi-Scale Structural SIMilarity (MS-SSIM) index is noteworthy. The LIVE Video Quality Database is freely available for download1 and we hope that our study provides researchers with a valuable tool to benchmark and improve the performance of objective VQA algorithms.

215 citations

Journal ArticleDOI
TL;DR: A novel blind/no-reference (NR) model for accessing the perceptual quality of screen content pictures with big data learning and delivers computational efficiency and promising performance.
Abstract: Recent years have witnessed a growing number of image and video centric applications on mobile, vehicular, and cloud platforms, involving a wide variety of digital screen content images Unlike natural scene images captured with modern high fidelity cameras, screen content images are typically composed of fewer colors, simpler shapes, and a larger frequency of thin lines In this paper, we develop a novel blind/no-reference (NR) model for accessing the perceptual quality of screen content pictures with big data learning The new model extracts four types of features descriptive of the picture complexity, of screen content statistics, of global brightness quality, and of the sharpness of details Comparative experiments verify the efficacy of the new model as compared with existing relevant blind picture quality assessment algorithms applied on screen content image databases A regression module is trained on a considerable number of training samples labeled with objective visual quality predictions delivered by a high-performance full-reference method designed for screen content image quality assessment (IQA) This results in an opinion-unaware NR blind screen content IQA algorithm Our proposed model delivers computational efficiency and promising performance The source code of the new model will be available at: https://sitesgooglecom/site/guke198701/publications

213 citations

Journal ArticleDOI
TL;DR: New border features are proposed, which are able to effectively characterize border irregularities on both complete lesions and incomplete lesions, and are designed that combines back propagation (BP) neural networks with fuzzy neural networks to achieve improved performance.
Abstract: We develop a novel method for classifying melanocytic tumors as benign or malignant by the analysis of digital dermoscopy images. The algorithm follows three steps: first, lesions are extracted using a self-generating neural network (SGNN); second, features descriptive of tumor color, texture and border are extracted; and third, lesion objects are classified using a classifier based on a neural network ensemble model. In clinical situations, lesions occur that are too large to be entirely contained within the dermoscopy image. To deal with this difficult presentation, new border features are proposed, which are able to effectively characterize border irregularities on both complete lesions and incomplete lesions. In our model, a network ensemble classifier is designed that combines back propagation (BP) neural networks with fuzzy neural networks to achieve improved performance. Experiments are carried out on two diverse dermoscopy databases that include images of both the xanthous and caucasian races. The results show that classification accuracy is greatly enhanced by the use of the new border features and the proposed classifier model.

212 citations

Journal ArticleDOI
TL;DR: A foveation scalable video coding (FSVC) algorithm which supplies good quality-compression performance as well as effective rate scalability, and is adaptable to different applications, such as knowledge-based video coding and video communications over time-varying, multiuser and interactive networks.
Abstract: Image and video coding is an optimization problem. A successful image and video coding algorithm delivers a good tradeoff between visual quality and other coding performance measures, such as compression, complexity, scalability, robustness, and security. In this paper, we follow two recent trends in image and video coding research. One is to incorporate human visual system (HVS) models to improve the current state-of-the-art of image and video coding algorithms by better exploiting the properties of the intended receiver. The other is to design rate scalable image and video codecs, which allow the extraction of coded visual information at continuously varying bit rates from a single compressed bitstream. Specifically, we propose a foveation scalable video coding (FSVC) algorithm which supplies good quality-compression performance as well as effective rate scalability. The key idea is to organize the encoded bitstream to provide the best decoded video at an arbitrary bit rate in terms of foveated visual quality measurement. A foveation-based HVS model plays an important role in the algorithm. The algorithm is adaptable to different applications, such as knowledge-based video coding and video communications over time-varying, multiuser and interactive networks.

212 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, which can be applied to both subjective ratings and objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu//spl sim/lcv/ssim/.

40,609 citations

Book
01 Jan 1998
TL;DR: An introduction to a Transient World and an Approximation Tour of Wavelet Packet and Local Cosine Bases.
Abstract: Introduction to a Transient World. Fourier Kingdom. Discrete Revolution. Time Meets Frequency. Frames. Wavelet Zoom. Wavelet Bases. Wavelet Packet and Local Cosine Bases. An Approximation Tour. Estimations are Approximations. Transform Coding. Appendix A: Mathematical Complements. Appendix B: Software Toolboxes.

17,693 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without handengineering our loss functions either.

11,958 citations

Posted Content
TL;DR: Conditional Adversarial Network (CA) as discussed by the authors is a general-purpose solution to image-to-image translation problems, which can be used to synthesize photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

11,127 citations

Journal ArticleDOI
01 Apr 1988-Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. Subsequently, very little is known especially in mud-dominated calciclastic submarine fan systems. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) Calciturbidites, comprising mostly of highto low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones which are characterised by planar laminated and unlaminated muddominated facies; and 3) Calcidebrites which are muddy or hyper-concentrated debrisflow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These

9,929 citations