
Showing papers by "Alan C. Bovik published in 2014"


Journal ArticleDOI
TL;DR: It is found that the pixel-wise gradient magnitude similarity (GMS) between the reference and distorted images, combined with a novel pooling strategy (the standard deviation of the GMS map), can accurately predict perceptual image quality.
Abstract: It is an important task to faithfully evaluate the perceptual quality of output images in many applications, such as image compression, image restoration, and multimedia streaming. A good image quality assessment (IQA) model should not only deliver high prediction accuracy, but also be computationally efficient. The efficiency of IQA metrics is becoming particularly important due to the increasing proliferation of high-volume visual data in high-speed networks. We present a new effective and efficient IQA model, called gradient magnitude similarity deviation (GMSD). Image gradients are sensitive to image distortions, while different local structures in a distorted image suffer different degrees of degradation. This motivates us to explore the use of the global variation of a gradient-based local quality map for overall image quality prediction. We find that the pixel-wise gradient magnitude similarity (GMS) between the reference and distorted images, combined with a novel pooling strategy (the standard deviation of the GMS map), can accurately predict perceptual image quality. The resulting GMSD algorithm is much faster than most state-of-the-art IQA methods, and delivers highly competitive prediction accuracy. MATLAB source code of GMSD can be downloaded at http://www4.comp.polyu.edu.hk/~cslzhang/IQA/GMSD/GMSD.htm.
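
As a concrete illustration, here is a minimal NumPy/SciPy sketch of the GMS/GMSD computation described above, assuming 8-bit grayscale inputs. The Prewitt-style filters and the constant c = 170 follow the paper's description; the paper's 2x average-downsampling preprocessing step is omitted.

```python
import numpy as np
from scipy.ndimage import convolve

def gmsd(ref, dst, c=170.0):
    """Sketch of Gradient Magnitude Similarity Deviation (c is tuned
    for 8-bit intensities, per the paper; downsampling step omitted)."""
    hx = np.array([[1/3, 0, -1/3]] * 3)   # Prewitt-style horizontal filter
    hy = hx.T                             # vertical filter

    def grad_mag(img):
        gx = convolve(img.astype(float), hx)
        gy = convolve(img.astype(float), hy)
        return np.sqrt(gx**2 + gy**2)

    mr, md = grad_mag(ref), grad_mag(dst)
    gms = (2 * mr * md + c) / (mr**2 + md**2 + c)   # per-pixel similarity map
    return gms.std()                                # deviation pooling
```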

1,211 citations


Journal ArticleDOI
TL;DR: It is found that SSEQ matches well with human subjective opinions of image quality, and is statistically superior to the full-reference IQA algorithm SSIM and several top-performing NR IQA methods: BIQI, DIIVINE, and BLIINDS-II.
Abstract: We develop an efficient general-purpose no-reference (NR) image quality assessment (IQA) model that utilizes local spatial and spectral entropy features on distorted images. Using a 2-stage framework of distortion classification followed by quality assessment, we utilize a support vector machine (SVM) to train an image distortion and quality prediction engine. The resulting algorithm, dubbed Spatial–Spectral Entropy-based Quality (SSEQ) index, is capable of assessing the quality of a distorted image across multiple distortion categories. We explain the entropy features used and their relevance to perception and thoroughly evaluate the algorithm on the LIVE IQA database. We find that SSEQ matches well with human subjective opinions of image quality, and is statistically superior to the full-reference (FR) IQA algorithm SSIM and several top-performing NR IQA methods: BIQI, DIIVINE, and BLIINDS-II. SSEQ has considerably low computational complexity. We also tested SSEQ on the TID2008 database to ascertain whether its performance is database independent.
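
A hedged sketch of the two feature types SSEQ is built on: block-wise spatial entropy of intensities and spectral entropy of block-DCT energy. The block size, histogram settings, and simple mean pooling below are assumptions; the paper uses its own percentile pooling and multiscale processing before SVM training.

```python
import numpy as np
from scipy.fftpack import dct

def block_entropies(img, bs=8):
    """Per-block spatial and spectral (DCT) entropies, mean-pooled."""
    spatial, spectral = [], []
    h, w = img.shape
    for i in range(0, h - bs + 1, bs):
        for j in range(0, w - bs + 1, bs):
            b = img[i:i+bs, j:j+bs].astype(float)
            # Spatial entropy from the block's intensity histogram.
            p, _ = np.histogram(b, bins=256, range=(0, 255))
            p = p[p > 0] / p.sum()
            spatial.append(-(p * np.log2(p)).sum())
            # Spectral entropy from normalized 2D-DCT energy (DC excluded).
            d = dct(dct(b, axis=0, norm='ortho'), axis=1, norm='ortho')
            e = d**2
            e[0, 0] = 0.0
            if e.sum() > 0:
                q = (e / e.sum()).ravel()
                q = q[q > 0]
                spectral.append(-(q * np.log2(q)).sum())
    return np.mean(spatial), np.mean(spectral)
```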

562 citations


Journal ArticleDOI
TL;DR: This work proposes a novel BIQA model that utilizes the joint statistics of two types of commonly used local contrast features: 1) the gradient magnitude (GM) map and 2) the Laplacian of Gaussian response.
Abstract: Blind image quality assessment (BIQA) aims to evaluate the perceptual quality of a distorted image without information regarding its reference image. Existing BIQA models usually predict the image quality by analyzing the image statistics in some transformed domain, e.g., in the discrete cosine transform domain or wavelet domain. Though great progress has been made in recent years, BIQA is still a very challenging task due to the lack of a reference image. Considering that image local contrast features convey important structural information that is closely related to image perceptual quality, we propose a novel BIQA model that utilizes the joint statistics of two types of commonly used local contrast features: 1) the gradient magnitude (GM) map and 2) the Laplacian of Gaussian (LOG) response. We employ an adaptive procedure to jointly normalize the GM and LOG features, and show that the joint statistics of normalized GM and LOG features have desirable properties for the BIQA task. The proposed model is extensively evaluated on three large-scale benchmark databases, and shown to deliver highly competitive performance with state-of-the-art BIQA models, as well as with some well-known full reference image quality assessment models.
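
To make the two local contrast maps concrete, the SciPy sketch below computes the gradient magnitude (GM) map and the Laplacian of Gaussian (LOG) response and bins them into a joint histogram. The smoothing scale and binning are placeholders, and the paper's joint adaptive normalization and marginal/conditional probability features are not reproduced.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

def gm_log_features(img, sigma=0.5, bins=10):
    """Joint histogram of GM and |LOG| responses (a hedged sketch)."""
    img = img.astype(float)
    gx = gaussian_filter(img, sigma, order=(0, 1))   # d/dx of smoothed image
    gy = gaussian_filter(img, sigma, order=(1, 0))   # d/dy of smoothed image
    gm = np.sqrt(gx**2 + gy**2)                      # gradient magnitude map
    log = gaussian_laplace(img, sigma)               # LOG response
    joint, _, _ = np.histogram2d(gm.ravel(), np.abs(log).ravel(),
                                 bins=bins, density=True)
    return joint.ravel()    # joint-statistics feature vector for a learner
```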

535 citations


Journal ArticleDOI
TL;DR: It is shown that the proposed NSS and motion coherency models are appropriate for quality assessment of videos, and they are utilized to design a blind VQA algorithm that correlates highly with human judgments of quality.
Abstract: We propose a blind (no reference or NR) video quality evaluation model that is nondistortion specific. The approach relies on a spatio-temporal model of video scenes in the discrete cosine transform domain, and on a model that characterizes the type of motion occurring in the scenes, to predict video quality. We use the models to define video statistics and perceptual features that are the basis of a video quality assessment (VQA) algorithm that does not require the presence of a pristine video to compare against in order to predict a perceptual quality score. The contributions of this paper are threefold. 1) We propose a spatio-temporal natural scene statistics (NSS) model for videos. 2) We propose a motion model that quantifies motion coherency in video scenes. 3) We show that the proposed NSS and motion coherency models are appropriate for quality assessment of videos, and we utilize them to design a blind VQA algorithm that correlates highly with human judgments of quality. The proposed algorithm, called video BLIINDS, is tested on the LIVE VQA database and on the EPFL-PoliMi video database and shown to perform close to the level of top performing reduced and full reference VQA algorithms.
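
To make the spatio-temporal NSS idea concrete, this hedged sketch fits a generalized Gaussian shape parameter to block-DCT coefficients of frame differences. The block size and the moment-matching fit are assumptions; the paper's frequency-band partitioning, temporal pooling, and motion-coherency model are omitted.

```python
import numpy as np
from scipy.fftpack import dct
from scipy.special import gamma as G

def ggd_shape(x):
    """Moment-matching estimate of the generalized Gaussian shape
    parameter (the standard ratio method from the NSS literature)."""
    rho = np.mean(np.abs(x))**2 / np.mean(x**2)
    cands = np.arange(0.2, 10.0, 0.001)
    ratios = G(2 / cands)**2 / (G(1 / cands) * G(3 / cands))
    return cands[np.argmin((ratios - rho)**2)]

def frame_diff_shapes(frames, bs=5):
    """Shape parameters of frame-difference DCT coefficients, per frame pair."""
    shapes = []
    for f0, f1 in zip(frames[:-1], frames[1:]):
        d = f1.astype(float) - f0.astype(float)
        h, w = d.shape
        coefs = []
        for i in range(0, h - bs + 1, bs):
            for j in range(0, w - bs + 1, bs):
                b = dct(dct(d[i:i+bs, j:j+bs], axis=0, norm='ortho'),
                        axis=1, norm='ortho')
                coefs.append(b.ravel()[1:])   # drop the DC coefficient
        shapes.append(ggd_shape(np.concatenate(coefs)))
    return np.asarray(shapes)
```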

383 citations


Journal ArticleDOI
TL;DR: The resulting algorithm, dubbed CurveletQA, correlates well with human subjective opinions of image quality, delivering performance that is competitive with popular full-reference IQA algorithms such as SSIM, and with top-performing NR IQA models.
Abstract: We study the efficacy of utilizing a powerful image descriptor, the curvelet transform, to learn a no-reference (NR) image quality assessment (IQA) model. A set of statistical features is extracted from a computed image curvelet representation, including the coordinates of the maxima of the log-histograms of the curvelet coefficient values, and the energy distributions of both orientation and scale in the curvelet domain. Our results indicate that these features are sensitive to the presence and severity of image distortion. Operating within a 2-stage framework of distortion classification followed by quality assessment, we train an image distortion and quality prediction engine using a support vector machine (SVM). The resulting algorithm, dubbed CurveletQA for short, was tested on the LIVE IQA database and compared to state-of-the-art NR/FR IQA algorithms. We found that CurveletQA correlates well with human subjective opinions of image quality, delivering performance that is competitive with popular full-reference (FR) IQA algorithms such as SSIM, and with top-performing NR IQA models. At the same time, CurveletQA has a relatively low complexity.

176 citations


Journal ArticleDOI
TL;DR: This paper presents a complex extension of the DIIVINE algorithm (called C-DIIVINE), which blindly assesses image quality based on the complex Gaussian scale mixture model corresponding to the complex version of the steerable pyramid wavelet transform.
Abstract: It is widely known that the wavelet coefficients of natural scenes possess certain statistical regularities which can be affected by the presence of distortions. The DIIVINE (Distortion Identification-based Image Verity and Integrity Evaluation) algorithm is a successful no-reference image quality assessment (NR IQA) algorithm, which estimates quality based on changes in these regularities. However, DIIVINE operates based on real-valued wavelet coefficients, whereas the visual appearance of an image can be strongly determined by both the magnitude and phase information. In this paper, we present a complex extension of the DIIVINE algorithm (called C-DIIVINE), which blindly assesses image quality based on the complex Gaussian scale mixture model corresponding to the complex version of the steerable pyramid wavelet transform. Specifically, we applied three commonly used distribution models to fit the statistics of the wavelet coefficients: (1) the complex generalized Gaussian distribution is used to model the wavelet coefficient magnitudes, (2) the generalized Gaussian distribution is used to model the coefficients' relative magnitudes, and (3) the wrapped Cauchy distribution is used to model the coefficients' relative phases. All these distributions have characteristic shapes that are consistent across different natural images but change significantly in the presence of distortions. We also employ the complex wavelet structural similarity index to measure degradation of the correlations across image scales, which serves as an important indicator of the image's energy distribution and the loss of alignment of local spectral components contributing to image structure. Experimental results show that these complex extensions allow C-DIIVINE to yield a substantial improvement in predictive performance as compared to its predecessor, and highly competitive performance relative to other recent no-reference algorithms.
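
The wrapped Cauchy ingredient is available directly in SciPy. The toy example below fits it to synthetic phase samples; in actual use one would fit the relative phases of complex steerable pyramid coefficients, which are not computed here.

```python
import numpy as np
from scipy.stats import wrapcauchy

# Synthetic stand-in for relative-phase samples on [0, 2*pi).
rng = np.random.default_rng(0)
phases = np.mod(rng.vonmises(0.0, 4.0, 5000), 2 * np.pi)

# Fit only the concentration parameter c (location/scale pinned).
c, loc, scale = wrapcauchy.fit(phases, floc=0, fscale=1)
print(f"wrapped Cauchy concentration c = {c:.3f}")  # shifts under distortion
```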

106 citations


Journal ArticleDOI
TL;DR: A Hammerstein-Wiener model is presented for predicting the time-varying subjective quality (TVSQ) of rate-adaptive videos and it is shown that the model is able to reliably predict the TVSQ of rate adaptive videos.
Abstract: Newly developed hypertext transfer protocol (HTTP)-based video streaming technologies enable flexible rate-adaptation under varying channel conditions. Accurately predicting the users' quality of experience (QoE) for rate-adaptive HTTP video streams is thus critical to achieve efficiency. An important aspect of understanding and modeling QoE is predicting the up-to-the-moment subjective quality of a video as it is played, which is difficult due to hysteresis effects and nonlinearities in human behavioral responses. This paper presents a Hammerstein-Wiener model for predicting the time-varying subjective quality (TVSQ) of rate-adaptive videos. To collect data for model parameterization and validation, a database of longer duration videos with time-varying distortions was built and the TVSQs of the videos were measured in a large-scale subjective study. The proposed method is able to reliably predict the TVSQ of rate-adaptive videos. Since the Hammerstein-Wiener model has a very simple structure, the proposed method is suitable for online TVSQ prediction in HTTP-based streaming.
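
The Hammerstein-Wiener structure itself is compact: a memoryless input nonlinearity, a linear dynamic block, and a memoryless output nonlinearity. The sketch below uses tanh nonlinearities and a first-order IIR filter purely as placeholders; the paper's fitted nonlinearities and filter orders are not reproduced.

```python
import numpy as np
from scipy.signal import lfilter

def hammerstein_wiener(q, b, a, f=np.tanh, g=np.tanh):
    """q: per-second short-term quality trace; returns a TVSQ-like trace."""
    u = f(q)              # static input nonlinearity (e.g., saturation)
    x = lfilter(b, a, u)  # linear dynamics model hysteresis/smoothing
    return g(x)           # static output nonlinearity

# Example: smooth a noisy quality trace with slow exponential memory.
q = np.random.default_rng(1).uniform(-1, 1, 200)
tvsq = hammerstein_wiener(q, b=[0.1], a=[1, -0.9])  # DC gain of 1
```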

97 citations


Journal ArticleDOI
TL;DR: A new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type is described.
Abstract: We describe a new 3D saliency prediction model that accounts for diverse low-level luminance, chrominance, motion, and depth attributes of 3D videos as well as high-level classifications of scenes by type. The model also accounts for perceptual factors, such as the nonuniform resolution of the human eye, stereoscopic limits imposed by Panum's fusional area, and the predicted degree of (dis)comfort felt when viewing the 3D video. The high-level analysis involves classification of each 3D video scene by type with regard to estimated camera motion and the motions of objects in the videos. Decisions regarding the relative saliency of objects or regions are supported by data obtained through a series of eye-tracking experiments. The algorithm developed from the model elements operates by finding and segmenting salient 3D space-time regions in a video, then calculating the saliency strength of each segment using measured attributes of motion, disparity, texture, and the predicted degree of visual discomfort experienced. The saliency energy of both segmented objects and frames is weighted using models of human foveation and Panum's fusional area, yielding a single predictor of 3D saliency.

95 citations


Journal ArticleDOI
TL;DR: The 3D-AVM Predictor accounts for anomalous motor responses of both accommodation and vergence, yielding predictive power that is statistically superior to prior models that rely on a computed disparity distribution only.
Abstract: To achieve clear binocular vision, neural processes that accomplish accommodation and vergence are performed via two collaborative, cross-coupled processes: accommodation-vergence (AV) and vergence-accommodation (VA). However, when people watch stereo images on stereoscopic displays, normal neural functioning may be disturbed owing to anomalies of the cross-link gains. These anomalies are likely the main cause of visual discomfort experienced when viewing stereo images, and are called Accommodation-Vergence Mismatches (AVM). Moreover, the absence of any useful accommodation depth cues when viewing 3D content on a flat panel (planar) display induces anomalous demands on binocular fusion, resulting in possible additional visual discomfort. Most prior efforts in this direction have focused on predicting anomalies in the AV cross-link using measurements on a computed disparity map. We further these contributions by developing a model that accounts for both accommodation and vergence, resulting in a new visual discomfort prediction algorithm dubbed the 3D-AVM Predictor. The 3D-AVM model and algorithm make use of a new concept we call local 3D bandwidth (BW) which is defined in terms of the physiological optics of binocular vision and foveation. The 3D-AVM Predictor accounts for anomalous motor responses of both accommodation and vergence, yielding predictive power that is statistically superior to prior models that rely on a computed disparity distribution only.

65 citations


Journal ArticleDOI
TL;DR: A new video quality model (VQM) that accounts for the perceptual impact of variable frame delays (VFD) in videos with demonstrated top performance on the laboratory for image and video engineering (LIVE) mobile video quality assessment ( VQA) database.
Abstract: We announce a new video quality model (VQM) that accounts for the perceptual impact of variable frame delays (VFD) in videos with demonstrated top performance on the laboratory for image and video engineering (LIVE) mobile video quality assessment (VQA) database. This model, called VQM_VFD, uses perceptual features extracted from spatial-temporal blocks spanning fixed angular extents and a long edge detection filter. VQM_VFD predicts video quality by measuring multiple frame delays using perception-based parameters to track subjective quality over time. In the performance analysis of VQM_VFD, we evaluated its efficacy at predicting human opinions of visual quality. A detailed correlation analysis and statistical hypothesis testing show that VQM_VFD accurately predicts human subjective judgments and substantially outperforms top-performing image quality assessment and VQA models previously tested on the LIVE mobile VQA database. VQM_VFD achieved the best performance on the mobile and tablet studies of the LIVE mobile VQA database for simulated compression, wireless packet-loss, and rate adaptation, but not for temporal dynamics. These results validate the new model and warrant a hard release of the VQM_VFD algorithm. It is freely available for any purpose, commercial or noncommercial, at http://www.its.bldrdoc.gov/vqm/.

60 citations


Proceedings ArticleDOI
05 Feb 2014
TL;DR: A new mobile video database that models distortions caused by network impairments and is making the database publicly available in order to help advance state-of-the-art research on user-centric mobile network planning and management.
Abstract: We have created a new mobile video database that models distortions caused by network impairments. In particular, we simulate stalling events and startup delays in over-the-top (OTT) mobile streaming videos. We describe the way we simulated diverse stalling events to create a corpus of distorted videos and the human study we conducted to obtain subjective scores. We also analyzed the ratings to understand the impact of several factors that influence the quality of experience (QoE). To the best of our knowledge, ours is the most comprehensive and diverse study on the effects of stalling events on QoE. We are making the database publicly available [1] in order to help advance state-of-the-art research on user-centric mobile network planning and management.

Proceedings ArticleDOI
05 Feb 2014
TL;DR: A novel natural-scene-statistics-based blind image quality assessment model that is created by training a deep belief net to discover good feature representations that are used to learn a regressor for quality prediction is presented.
Abstract: We present a novel natural-scene-statistics-based blind image quality assessment model that is created by training a deep belief net to discover good feature representations that are used to learn a regressor for quality prediction. The proposed deep model has an unsupervised pre-training stage followed by a supervised fine-tuning stage, enabling it to generalize over different distortion types, mixtures, and severities. We evaluated our new model on a recently created database of images afflicted by real distortions, and show that it outperforms current state-of-the-art blind image quality prediction models.
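
A scaled-down stand-in for the two-stage idea using scikit-learn: stacked RBMs approximate the unsupervised pre-training of a deep belief net, and an SVR stands in for the supervised quality regressor (the paper's end-to-end fine-tuning is not reproduced). The feature dimensions and data below are placeholders.

```python
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.random((500, 36))   # placeholder NSS feature vectors in [0, 1]
y = rng.random(500) * 100   # placeholder subjective quality scores

model = Pipeline([
    ("rbm1", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20)),
    ("rbm2", BernoulliRBM(n_components=32, learning_rate=0.05, n_iter=20)),
    ("svr", SVR(C=10.0)),   # supervised stage on the learned representation
])
model.fit(X, y)
print(model.predict(X[:3]))
```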

Journal ArticleDOI
TL;DR: Experimental results show that the proposed SVC algorithm achieves high correlation with human judgments when assessing the blur distortion of images and is well-suited for real-time applications.

Journal ArticleDOI
03 Dec 2014-PLOS ONE
TL;DR: A fully automated microfluidic platform for performing laser axotomies of fluorescently tagged neurons in living Caenorhabditis elegans establishes a promising methodology for prospective genome-wide screening of nerve regeneration in C. elegans in a truly high-throughput manner.
Abstract: Femtosecond laser nanosurgery has been widely accepted as an axonal injury model, enabling nerve regeneration studies in the small model organism, Caenorhabditis elegans. To overcome the time limitations of manual worm handling techniques, automation and new immobilization technologies must be adopted to improve throughput in these studies. While new microfluidic immobilization techniques have been developed that promise to reduce the time required for axotomies, there is a need for automated procedures to minimize the required amount of human intervention and accelerate the axotomy processes crucial for high throughput. Here, we report a fully automated microfluidic platform for performing laser axotomies of fluorescently tagged neurons in living Caenorhabditis elegans. The presented automation process reduces the time required to perform axotomies within individual worms to ∼17 s/worm, at least one order of magnitude faster than manual approaches. The full automation is achieved with a unique chip design and an operation sequence that is fully computer controlled and synchronized with efficient and accurate image processing algorithms. The microfluidic device includes a T-shaped architecture and three-dimensional microfluidic interconnects to serially transport, position, and immobilize worms. The image processing algorithms can identify and precisely position axons targeted for ablation. There were no statistically significant differences observed in reconnection probabilities between axotomies carried out with the automated system and those performed manually with anesthetics. The overall success rate of automated axotomies was 67.4±3.2% of the cases (236/350), with an average processing time of 17.0±2.4 s per worm. This fully automated platform establishes a promising methodology for prospective genome-wide screening of nerve regeneration in C. elegans in a truly high-throughput manner.

Journal ArticleDOI
TL;DR: A new framework for quantifying 3D visual information is proposed and applied to the problem of predicting visual fatigue experienced when viewing 3D displays; the 3DVA fits the empirical distributions of wavelet coefficients to a parametric generalized Gaussian probability distribution model, combined with a set of 3D perceptual weights.
Abstract: One of the most challenging ongoing issues in the field of 3D visual research is how to perceptually quantify object and surface visualizations that are displayed within a virtual 3D space between a human eye and 3D display. To seek an effective method of quantification, it is necessary to measure various elements related to the perception of 3D objects at different depths. We propose a new framework for quantifying 3D visual information that we call 3D visual activity (3DVA), which utilizes natural scene statistics measured over 3D visual coordinates. We account for important aspects of 3D perception by carrying out a 3D coordinate transform reflecting the nonuniform sampling resolution of the eye and the process of stereoscopic fusion. The 3DVA fits the empirical distributions of wavelet coefficients to a parametric generalized Gaussian probability distribution model, combined with a set of 3D perceptual weights. We conducted a series of simulations that demonstrate the effectiveness of the 3DVA for quantifying the statistical dynamics of visual 3D space with respect to disparity, motion, texture, and color. A successful example application is also provided, whereby 3DVA is applied to the problem of predicting visual fatigue experienced when viewing 3D displays.

Journal ArticleDOI
TL;DR: It is found that 3D quality of experience (QoE) assessment results obtained using MICSQ are more reliable over a wide dynamic range of content than obtained by the conventional single stimulus continuous quality evaluation (SSCQE) protocol.
Abstract: People experience a variety of 3D visual programs, such as 3D cinema, 3D TV and 3D games, making it necessary to deploy reliable methodologies for predicting each viewer's subjective experience. We propose a new methodology that we call multimodal interactive continuous scoring of quality (MICSQ). MICSQ is composed of a device interaction process between the 3D display and a separate device (PC, tablet, etc.) used as an assessment tool, and a human interaction process between the subject(s) and the separate device. The scoring process is multimodal, using aural and tactile cues to help engage and focus the subject(s) on their tasks by enhancing neuroplasticity. Recorded human responses to 3D visualizations obtained via MICSQ correlate highly with measurements of spatial and temporal activity in the 3D video content. We have also found that 3D quality of experience (QoE) assessment results obtained using MICSQ are more reliable over a wide dynamic range of content than obtained by the conventional single stimulus continuous quality evaluation (SSCQE) protocol. Moreover, the wireless device interaction process makes it possible for multiple subjects to assess 3D QoE simultaneously in a large space such as a movie theater, at different viewing angles and distances. We conducted a series of 3D experiments demonstrating the accuracy and versatility of the new system, yielding new findings on visual comfort with respect to disparity and motion, as well as an interesting relation between the naturalness and depth of field (DOF) of a stereo camera.

Journal ArticleDOI
TL;DR: This work has developed a no-reference framework for automatically predicting the perceptual quality of camera-shaken images based on their spectral statistics, and demonstrates the performance of an algorithm derived from these features on new and existing databases of images distorted by camera shake.
Abstract: The tremendous explosion of image-, video-, and audio-enabled mobile devices, such as tablets and smartphones in recent years, has led to an associated dramatic increase in the volume of captured and distributed multimedia content. In particular, the number of digital photographs being captured annually is approaching 100 billion in the U.S. alone. These pictures are increasingly being acquired by inexperienced, casual users under highly diverse conditions leading to a plethora of distortions, including blur induced by camera shake. In order to be able to automatically detect, correct, or cull images impaired by shake-induced blur, it is necessary to develop distortion models specific to and suitable for assessing the sharpness of camera-shaken images. Toward this goal, we have developed a no-reference framework for automatically predicting the perceptual quality of camera-shaken images based on their spectral statistics. Two kinds of features are defined that capture blur induced by camera shake. One is a directional feature, which measures the variation of the image spectrum across orientations. The second feature captures the shape, area, and orientation of the spectral contours of camera shaken images. We demonstrate the performance of an algorithm derived from these features on new and existing databases of images distorted by camera shake.
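
The directional feature can be sketched in a few lines: measure how unevenly Fourier energy is spread across orientations, since linear camera shake concentrates spectral energy perpendicular to the blur direction. The binning and the spread statistic below are assumptions; the paper's spectral-contour shape features are not reproduced.

```python
import numpy as np

def orientation_energy_spread(img, nbins=36):
    """Coefficient of variation of spectral energy across orientations."""
    f = np.abs(np.fft.fftshift(np.fft.fft2(img.astype(float))))**2
    h, w = f.shape
    yy, xx = np.mgrid[-(h // 2):h - h // 2, -(w // 2):w - w // 2]
    theta = np.mod(np.arctan2(yy, xx), np.pi)   # orientation of each bin
    idx = np.minimum((theta / np.pi * nbins).astype(int), nbins - 1)
    energy = np.bincount(idx.ravel(), weights=f.ravel(), minlength=nbins)
    energy /= energy.sum()
    return energy.std() / energy.mean()   # large spread suggests shake blur
```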

Proceedings ArticleDOI
01 Nov 2014
TL;DR: A new image quality database that models diverse authentic image distortions and artifacts that affect images captured using modern mobile devices, together with a new online crowdsourcing system that we are using to conduct a very large-scale, ongoing, multi-month image quality assessment (IQA) subjective study.
Abstract: We designed and created a new image quality database that models diverse authentic image distortions and artifacts that affect images captured using modern mobile devices. We also designed and implemented a new online crowdsourcing system, which we are using to conduct a very large-scale, ongoing, multi-month image quality assessment (IQA) subjective study, wherein a wide range of diverse observers record their judgments of image quality. Our database currently consists of over 320,000 opinion scores on 1,163 authentically distorted images evaluated by over 7,000 human observers. The new database will soon be made freely available for download, and we envision that the fruits of our efforts will provide researchers with a valuable tool to benchmark and improve the performance of objective IQA algorithms.

Proceedings ArticleDOI
28 Jan 2014
TL;DR: This paper introduces an objective model called the delivery quality score (DQS) model, to predict user's QoE in the presence of such impairments, and demonstrates that the DQS model correlates highly with the subjective data and that it outperforms other emerging models.
Abstract: The vast majority of today's internet video services are consumed over-the-top (OTT) via reliable streaming (HTTP via TCP), where the primary noticeable delivery-related impairments are startup delay and stalling. In this paper we introduce an objective model called the delivery quality score (DQS) model, to predict users' QoE in the presence of such impairments. We describe a large subjective study that we carried out to tune and validate this model. Our experiments demonstrate that the DQS model correlates highly with the subjective data and that it outperforms other emerging models.

Journal ArticleDOI
TL;DR: A new set of features, called qualHOG, is proposed for robust face detection; it augments face-indicative Histogram of Oriented Gradients features with perceptual quality-aware spatial Natural Scene Statistics features, and provides statistically significant improvement in tolerance to image distortions over a strong baseline.
Abstract: Motivated by the proliferation of low-cost digital cameras in mobile devices being deployed in automated surveillance networks, we study the interaction between perceptual image quality and a classic computer vision task of face detection. We quantify the degradation in performance of a popular and effective face detector when human-perceived image quality is degraded by distortions commonly occurring in capture, storage, and transmission of facial images, including noise, blur, and compression. It is observed that, within a certain range of perceived image quality, a modest increase in image quality can drastically improve face detection performance. These results can be used to guide resource or bandwidth allocation in acquisition or communication/delivery systems that are associated with face detection tasks. A new set of features, called qualHOG, is proposed for robust face detection, which augments face-indicative Histogram of Oriented Gradients (HOG) features with perceptual quality-aware spatial Natural Scene Statistics (NSS) features. Face detectors trained on these new features provide statistically significant improvement in tolerance to image distortions over a strong baseline. Distortion-dependent and distortion-unaware variants of the face detectors are proposed and evaluated on a large database of face images representing a wide range of distortions. A biased variant of the training algorithm is also proposed that further enhances the robustness of these face detectors. To facilitate this research, we created a new distorted face database (DFD), containing face and non-face patches from images impaired by a variety of common distortion types and levels. This new data set and relevant code are available for download and further experimentation at www.live.ece.utexas.edu/research/Quality/index.htm.
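
A minimal sketch of the qualHOG construction: concatenate HOG features with quality-aware spatial NSS statistics computed from mean-subtracted contrast-normalized (MSCN) coefficients. The particular NSS summary statistics chosen here are placeholders for the paper's parameterized NSS features.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from skimage.feature import hog

def mscn(img, sigma=7/6):
    """Spatial NSS: mean-subtracted contrast-normalized coefficients."""
    img = img.astype(float)
    mu = gaussian_filter(img, sigma)
    var = gaussian_filter(img**2, sigma) - mu**2
    return (img - mu) / (np.sqrt(np.abs(var)) + 1.0)

def qualhog_like(patch):
    """HOG features augmented with simple MSCN summary statistics."""
    h = hog(patch, orientations=9, pixels_per_cell=(8, 8),
            cells_per_block=(2, 2))
    m = mscn(patch)
    nss = np.array([m.mean(), m.var(), np.abs(m).mean(), (m**4).mean()])
    return np.concatenate([h, nss])   # feed to an SVM face detector
```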

Proceedings ArticleDOI
TL;DR: By utilizing robust bivariate models, this work is able to incorporate measurements of bivariate statistics between spatially adjacent luminance/chrominance and range information into various 3D image/video and computer vision applications, e.g., quality assessment, 2D-to-3D conversion, etc.
Abstract: The statistical properties embedded in visual stimuli from the surrounding environment guide and affect the evolutionary processes of human vision systems. There are strong statistical relationships between co-located luminance/chrominance and disparity bandpass coefficients in natural scenes. However, these statistical relationships have only been deeply developed to create point-wise statistical models, although there exist spatial dependencies between adjacent pixels in both 2D color images and range maps. Here we study the bivariate statistics of the joint and conditional distributions of spatially adjacent bandpass responses on both luminance/chrominance and range data of naturalistic scenes. We deploy bivariate generalized Gaussian distributions to model the underlying statistics. The analysis and modeling results show that there exist important and useful statistical properties of both joint and conditional distributions, which can be reliably described by the corresponding bivariate generalized Gaussian models. Furthermore, by utilizing these robust bivariate models, we are able to incorporate measurements of bivariate statistics between spatially adjacent luminance/chrominance and range information into various 3D image/video and computer vision applications, e.g., quality assessment, 2D-to-3D conversion, etc.

Proceedings ArticleDOI
TL;DR: A perceptual fog density prediction model based on natural scene statistics and “fog aware” statistical features, which can predict the visibility in a foggy scene from a single image without reference to a corresponding fogless image, without side geographical camera information, and without training on human-rated judgments.
Abstract: We propose a perceptual fog density prediction model based on natural scene statistics (NSS) and “fog aware” statistical features, which can predict the visibility in a foggy scene from a single image without reference to a corresponding fogless image, without side geographical camera information, without training on human-rated judgments, and without dependency on salient objects such as lane markings or traffic signs. The proposed fog density predictor only makes use of measurable deviations from statistical regularities observed in natural foggy and fog-free images. A fog aware collection of statistical features is derived from a corpus of foggy and fog-free images by using a space domain NSS model and observed characteristics of foggy images such as low contrast, faint color, and shifted intensity. The proposed model not only predicts perceptual fog density for the entire image but also provides a local fog density index for each patch. The predicted fog density of the model correlates well with the measured visibility in a foggy scene as measured by judgments taken in a human subjective study on a large foggy image database. As one application, the proposed model accurately evaluates the performance of defog algorithms designed to enhance the visibility of foggy images.
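
To illustrate the kinds of measurable deviations involved, the sketch below computes three simple fog-sensitive statistics on a single RGB image (fog tends to raise mean luminance, lower saturation, and lower local contrast). The paper's full fog-aware NSS feature set and its comparison against statistics of fog-free images are omitted.

```python
import numpy as np

def fog_sensitive_stats(rgb):
    """Mean luminance, saturation, and local RMS contrast of an RGB image."""
    rgb = rgb.astype(float) / 255.0
    lum = rgb.mean(axis=2)
    mx, mn = rgb.max(axis=2), rgb.min(axis=2)
    sat = np.where(mx > 0, (mx - mn) / (mx + 1e-6), 0.0)
    h, w = lum.shape
    contrast = [lum[i:i+8, j:j+8].std()                  # 8x8 patch contrast
                for i in range(0, h - 7, 8) for j in range(0, w - 7, 8)]
    return {"mean_luminance": float(lum.mean()),         # fog shifts intensity
            "mean_saturation": float(sat.mean()),        # fog fades color
            "mean_contrast": float(np.mean(contrast))}   # fog lowers contrast
```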

Journal ArticleDOI
TL;DR: Two new general blind image quality assessment (IQA) indices that respectively use the area and curvature of image reciprocal singular value curves are described, which can handle multiple unknown distortions and are no-training methods.
Abstract: The reciprocal singular value curves of natural images resemble inverse power functions. The bending degree of the reciprocal singular value curve varies with distortion type and severity. We describe two new general blind image quality assessment (IQA) indices that respectively use the area and curvature of image reciprocal singular value curves. These two methods require very little prior knowledge of any image or distortion and no training process, and they can handle multiple unknown distortions; hence they are training-free methods. Experimental results on five simulated databases show that the proposed algorithms deliver quality predictions that correlate highly with human subjective judgments, and that are competitive with other blind IQA models. Highlights: We identify a relationship between image distortion and the reciprocal singular value curve. We construct two new general blind IQA indices that respectively use the area and curvature of image reciprocal singular value curves. The proposed indices have the following advantages: (1) a simple mathematical expression leads to low computational complexity; (2) they can be applied to more distortion categories, such as "high frequency noise" and "WN-color."
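
The area index, at least, is easy to state. A sketch, with the normalization treated as an assumption since the paper's exact formulation is not reproduced here:

```python
import numpy as np

def rsv_area(img):
    """Area under the normalized reciprocal singular value curve."""
    s = np.linalg.svd(img.astype(float), compute_uv=False)
    r = 1.0 / (s + 1e-12)          # reciprocal singular values
    r = r / r.max()                # normalize to [0, 1]
    x = np.linspace(0.0, 1.0, r.size)
    return np.trapz(r, x)          # bending degree summarized by the area
```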

Proceedings ArticleDOI
14 Jul 2014
TL;DR: The experimental results show that the combination of binocular contrast, structural dissimilarity and average luminance exhibits high consistency with subjective scores of visual discomfort, fusion difficulty and overall binocular mismatches in terms of Spearman's Rank Ordered Correlation Coefficient.
Abstract: Luminance discrepancies between image pairs occur owing to inconsistent parameters between stereoscopic camera devices and from imperfect capture conditions. Such discrepancies induce binocular mismatches and affect the visual comfort that is felt by viewers, as well as their ability to fuse stereoscopic images. To better understand and observe this effect, we built a stereoscopic image database of 240 luminance-discrepancy images and 30 natural images with subjective scores of visual discomfort and fusion difficulty. Two features, binocular contrast and luminance similarity, were extracted to analyze the relationship between the subjective scores and the luminance discrepancies. Structural dissimilarity and average luminance are used to predict the effects of binocular mismatches. The experimental results show that the combination of binocular contrast, structural dissimilarity and average luminance exhibits high consistency with subjective scores of visual discomfort, fusion difficulty and overall binocular mismatches in terms of Spearman's Rank Ordered Correlation Coefficient.

Proceedings ArticleDOI
06 Apr 2014
TL;DR: The proposed defog and visibility enhancer makes use of statistical regularities observed in foggy and fog-free images to extract the most visible information from three processed image results: one white balanced and two contrast enhanced images.
Abstract: We propose a referenceless perceptual defog and visibility enhancement model based on multiscale “fog aware” statistical features. Our model operates on a single foggy image and uses a set of “fog aware” weight maps to improve the visibility of foggy regions. The proposed defog and visibility enhancer makes use of statistical regularities observed in foggy and fog-free images to extract the most visible information from three processed image results: one white balanced and two contrast enhanced images. Perceptual fog density, fog aware luminance, contrast, saturation, chrominance, and saliency weight maps smoothly blend these via a Laplacian pyramid. Evaluation on a variety of foggy images shows that the proposed model achieves better results for darker, denser foggy images as well as on standard defog test images.
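
The fusion step follows the standard Laplacian-pyramid blending pattern. The sketch below assumes grayscale inputs and OpenCV; it blends the derived images under their normalized weight maps. The construction of the white-balanced and contrast-enhanced inputs and of the fog-aware weight maps themselves is not shown.

```python
import numpy as np
import cv2

def fuse_multiscale(images, weights, levels=4):
    """Blend derived images under weight maps via Laplacian pyramids."""
    # Normalize the fog-aware weight maps so they sum to 1 at every pixel.
    wsum = np.sum(weights, axis=0) + 1e-6
    blended = None
    for img, w in zip(images, weights):
        gp = [(w / wsum).astype(np.float32)]   # Gaussian pyramid of weights
        ip = [img.astype(np.float32)]          # Gaussian pyramid of the image
        for _ in range(levels):
            gp.append(cv2.pyrDown(gp[-1]))
            ip.append(cv2.pyrDown(ip[-1]))
        # Laplacian pyramid: detail at each scale plus the coarse residual.
        lap = [ip[k] - cv2.pyrUp(ip[k + 1], dstsize=ip[k].shape[1::-1])
               for k in range(levels)] + [ip[-1]]
        terms = [lv * gv for lv, gv in zip(lap, gp)]
        blended = terms if blended is None else [b + t for b, t in zip(blended, terms)]
    # Collapse the blended pyramid back into a single image.
    out = blended[-1]
    for k in range(levels - 1, -1, -1):
        out = cv2.pyrUp(out, dstsize=blended[k].shape[1::-1]) + blended[k]
    return np.clip(out, 0, 255).astype(np.uint8)
```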

Proceedings ArticleDOI
06 Apr 2014
TL;DR: The application of modern blind IQA models are extended to study whether quality prediction on other image modality can find practical use.
Abstract: Recent work on the problem of Image Quality Assessment (IQA) has produced accurate subjective quality evaluators for visible light images. Two such algorithms are the Blind/Referenceless Image Spatial QUality Evaluator (BRISQUE) and the Natural Image Quality Evaluator (NIQE). Both models are useful in that they correlate highly with human visual perception of image quality. Given that other kinds of non-visible light images are also 'natural' projections of the world, and can be distorted thereby reducing the perceived quality, it is of interest to study whether quality prediction on other image modality can find practical use. To this end we have extended the application of modern blind IQA models.

Journal ArticleDOI
TL;DR: A simple filter-based model is created that successfully captures the psychophysical data over a wide range of velocities and flicker frequencies and finds that the threshold of silencing occurs when the log frequency of object replacement is roughly one quarter of the log flicker frequency.
Abstract: Motion can impair the perception of other visual changes. Suchow and Alvarez (2011a, Current Biology, 21, 140-143) recently demonstrated a striking 'motion silencing' illusion, in which the salient changes among a group of objects' luminances (or colors, etc.) appear to cease in the presence of large, coherent object motion. To understand why the visual system might be insensitive to changes in object luminances ('flicker') in the presence of object motion, we constructed similar stimuli and performed a systematic spectral analysis of them. We conducted human psychophysical experiments to examine motion silencing as a function of stimulus velocity, flicker frequency, and spacing; and we created a simple filter-based model as a working hypothesis of motion silencing. From the results, we found that the threshold of silencing occurs when the log frequency of object replacement is roughly one quarter of the log flicker frequency (the mean slope is approximately 0.27). The dependence of silencing on object spacing may be explained as a phenomenon of temporal sampling of the stimuli by the visual system. Our proposed model successfully captures the psychophysical data over a wide range of velocities and flicker frequencies.
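
In equation form, the reported threshold behavior is a log-log linear relation between the object-replacement frequency f_r and the flicker frequency f_f at the silencing threshold; the intercept kappa is left unspecified here since the abstract reports only the mean slope.

```latex
% silencing threshold: log replacement frequency grows with log flicker
% frequency with mean slope ~0.27 (roughly one quarter); \kappa unspecified
\log_{10} f_r \approx 0.27\, \log_{10} f_f + \kappa
```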

Journal ArticleDOI
TL;DR: The results from this study suggest that radiologists who can perceive stereo can reliably interpret breast tomosynthesis projection images using stereoscopic viewing.
Abstract: The purpose of this study was to evaluate stereoscopic perception of low-dose breast tomosynthesis projection images. In this Institutional Review Board exempt study, craniocaudal breast tomosynthesis cases (N = 47), consisting of 23 biopsy-proven malignant mass cases and 24 normal cases, were retrospectively reviewed. A stereoscopic pair comprised of two projection images that were ±4° apart from the zero angle projection was displayed on a Planar PL2010M stereoscopic display (Planar Systems, Inc., Beaverton, OR, USA). An experienced breast imager verified the truth for each case stereoscopically. A two-phase blinded observer study was conducted. In the first phase, two experienced breast imagers rated their ability to perceive 3D information using a scale of 1–3 and described the most suspicious lesion using the BI-RADS® descriptors. In the second phase, four experienced breast imagers were asked to make a binary decision on whether they saw a mass for which they would initiate a diagnostic workup or not and also report the location of the mass and provide a confidence score in the range of 0–100. The sensitivity and the specificity of the lesion detection task were evaluated. The results from our study suggest that radiologists who can perceive stereo can reliably interpret breast tomosynthesis projection images using stereoscopic viewing.

Journal ArticleDOI
TL;DR: The best methods are shown to accurately predict subjective opinions of the quality of printed photographs using data from a psychometric study.
Abstract: Measuring the visual quality of printed media is important since printed products play an important role in everyday life. Finding ways to automatically predict image quality has been an active research topic in digital image processing, but adapting those methods to measure the visual quality of printed media has not been studied often or in depth and is not straightforward. Here, we analyze the efficacy of no-reference image quality assessment (IQA) algorithms originally developed for digital IQA with regard to predicting the perceived quality of printed natural images. We perform a comprehensive statistical comparison of the methods. The best methods are shown to accurately predict subjective opinions of the quality of printed photographs using data from a psychometric study. (DOI: 10.1117/1.JEI.23.6.061106)

Proceedings ArticleDOI
28 Jan 2014
TL;DR: A rate-adaptation algorithm is proposed that can incorporate QoE constraints on the empirical cumulative quality distribution per user and can reduce network resource consumption over conventional average-quality maximized rate- Adaptation algorithms.
Abstract: We conducted a subjective study wherein we found that viewers' Quality of Experience (QoE) was strongly correlated with the empirical cumulative distribution function (eCDF) of the predicted video quality. Based on this observation, we propose a rate-adaptation algorithm that can incorporate QoE constraints on the empirical cumulative quality distribution per user. Simulation results show that the proposed technique can reduce network resource consumption by 29% over conventional average-quality maximized rate-adaptation algorithms.
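
A toy sketch of an eCDF-style QoE constraint of the kind described: require that a given fraction of a session's predicted per-segment quality scores exceed a floor. The constraint form, threshold, and fraction here are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def satisfies_qoe_constraint(pred_quality, q_min=60.0, frac=0.9):
    """True if at least `frac` of predicted segment qualities exceed q_min."""
    q = np.sort(np.asarray(pred_quality, dtype=float))
    ecdf_at_qmin = np.searchsorted(q, q_min, side='right') / q.size
    return 1.0 - ecdf_at_qmin >= frac

# A rate adapter would pick the highest bitrate whose predicted quality
# trace still satisfies the constraint.
print(satisfies_qoe_constraint([55, 70, 80, 90, 65], q_min=60, frac=0.75))
```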