Journal ArticleDOI

Blind Quality Estimation by Disentangling Perceptual and Noisy Features in High Dynamic Range Images

TL;DR: The proposed NR-IQA model detects visual artifacts in a distorted HDR image without any reference, taking perceptual masking effects into consideration, and predicts HDR image quality as accurately as state-of-the-art full-reference IQA methods.
Abstract: High dynamic range (HDR) image visual quality assessment in the absence of a reference image is challenging. This research topic has not been adequately studied largely due to the high cost of HDR display devices. Nevertheless, HDR imaging technology has attracted increasing attention, because it provides more realistic content, consistent to what the human visual system perceives. We propose a new no-reference image quality assessment (NR-IQA) model for HDR data based on convolutional neural networks. The proposed model is able to detect visual artifacts, taking into consideration perceptual masking effects, in a distorted HDR image without any reference. The error and perceptual masking values are measured separately, yet sequentially, and then processed by a mixing function to predict the perceived quality of the distorted image. Instead of using simple stimuli and psychovisual experiments, perceptual masking effects are computed from a set of annotated HDR images during our training process. Experimental results demonstrate that our proposed NR-IQA model can predict HDR image quality as accurately as state-of-the-art full-reference IQA methods.
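The two-stage design described above (an error estimate and a perceptual-masking estimate computed separately, then combined by a mixing function) can be illustrated with a minimal NumPy sketch. The division-based mixing rule below is an illustrative assumption, not the paper's actual learned mixing function:

```python
import numpy as np

def perceived_quality(error_map, masking_map, eps=1e-6):
    """Illustrative mixing of a per-pixel error estimate with a
    perceptual-masking estimate: errors in strongly masked regions
    contribute less to the pooled score.  This division-based rule is
    a hypothetical stand-in for the learned mixing function described
    in the paper."""
    visible_error = error_map / (masking_map + eps)  # masked errors shrink
    return float(np.mean(visible_error))             # simple average pooling

rng = np.random.default_rng(0)
err = rng.random((64, 64))            # stand-in per-pixel error map
mask_flat = np.full((64, 64), 0.5)    # weak masking (flat region)
mask_busy = np.full((64, 64), 2.0)    # strong masking (textured region)

# The same errors yield a lower visible-error score under strong masking.
assert perceived_quality(err, mask_busy) < perceived_quality(err, mask_flat)
```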
Citations
Book ChapterDOI
16 Dec 2019
TL;DR: This work explores the possibility of using deep features instead, particularly the encoded (bottleneck) feature maps of a Convolutional Autoencoder neural network architecture, and meets the primary requirements for an IQA method better than the state-of-the-art method for three types of distortions.
Abstract: Image Quality Assessment algorithms predict a quality score for a pristine or distorted input image, such that it correlates with human opinion. Traditional methods required a non-distorted "reference" version of the input image to compare with, in order to predict this score. However, recent "no-reference" methods circumvent this requirement by modelling the distribution of clean-image features, thereby making them more suitable for practical use. However, the majority of such methods either use hand-crafted features or require training on human opinion scores (supervised learning), which are difficult to obtain and standardise. We explore the possibility of using deep features instead, particularly the encoded (bottleneck) feature maps of a Convolutional Autoencoder neural network architecture. Also, we do not train the network on subjective scores (unsupervised learning). The primary requirements for an IQA method are a monotonic increase in predicted scores with increasing degree of input-image distortion, and consistent ranking of images with the same distortion type and content but different distortion levels. Quantitative experiments using the Pearson, Kendall and Spearman correlation scores on a diverse set of images show that our proposed method meets the above requirements better than the state-of-the-art method (which uses hand-crafted features) for three types of distortions: blurring, noise and compression artefacts. This demonstrates the potential for future research in this relatively unexplored sub-area within IQA.

9 citations

Journal ArticleDOI
TL;DR: Experimental results on the public ESPL-LIVE HDR database show that the Pearson linear correlation coefficient and the Spearman rank-order correlation coefficient of the proposed method reach 0.8302 and 0.7887, respectively, values superior to those of state-of-the-art blind TMI quality assessment methods, indicating that the proposal is highly consistent with human visual perception.

8 citations

Proceedings ArticleDOI
01 Oct 2020
TL;DR: Deep No-Reference Quality Metric (NoR-VDPNet), a deep-learning approach that learns to predict the global image-quality feature (i.e., the mean-opinion-score index Q) that HDR-VDP 2 computes, is proposed.
Abstract: HDR-VDP 2 has convincingly been shown to be a reliable metric for image quality assessment, and it currently plays a remarkable role in the evaluation of complex image processing algorithms. However, HDR-VDP 2 is known to be computationally expensive (both in terms of time and memory) and is constrained by the availability of a ground-truth image (the so-called reference) against which the quality of a processed image is quantified. These aspects impose severe limitations on the applicability of HDR-VDP 2 to real-world scenarios involving large quantities of data or requiring real-time responses. To address these issues, we propose the Deep No-Reference Quality Metric (NoR-VDPNet), a deep-learning approach that learns to predict the global image-quality feature (i.e., the mean-opinion-score index Q) that HDR-VDP 2 computes. NoR-VDPNet is no-reference (i.e., it operates without a ground-truth reference) and its computational cost is substantially lower than that of HDR-VDP 2 (by more than an order of magnitude). We demonstrate the performance of NoR-VDPNet in a variety of scenarios, including the optimization of the parameters of a denoiser and JPEG-XT.

8 citations


Cites background from "Blind Quality Estimation by Disenta..."

  • ...Some examples include purely data-driven approaches [17], approaches that learn rules from linguistic descriptions [18], others that extract different gradient-based features [19, 20], or approaches modeling the perceptual masking effects in distorted HDR images [21]....


Journal ArticleDOI
Mingxing Jiang, Liquan Shen, Linru Zheng, Min Zhao, Xuhao Jiang
TL;DR: Experimental results show that the proposed luminance-partition model for tone-mapped images (TMIs) and the corresponding quality measure outperform state-of-the-art no-reference (NR) methods.
Abstract: Tone mapping operators (TMOs) reproduce high dynamic range (HDR) images on low dynamic range (LDR) consumer electronics devices such as monitors or printers. To accurately measure and compare the performance of different TMOs, this article proposes a luminance-partition model for tone-mapped images (TMIs) and a corresponding quality measure. First, each tone-mapped (TM) image is segmented into a highlight region (HR), a dark region (DR) and a midtone region (MR) based on luminance partition. Second, local entropy and contrast features are extracted in the HR and DR, and color-based features are captured in the MR. Meanwhile, the gray-level co-occurrence matrix (GLCM) and the Canny operator are utilized to measure microstructural distortions and halo effects, respectively. Finally, all extracted features are combined and trained together with subjective ratings to form a regression model using support vector regression (SVR). Experimental results show that the proposed method outperforms state-of-the-art no-reference (NR) methods. Specifically, the Spearman rank correlation coefficient (SRCC) values of our method reach 0.83 and 0.76 on the tone-mapped image database (TMID) and the ESPL-LIVE HDR database, respectively.

8 citations


Cites background from "Blind Quality Estimation by Disenta..."

  • ...Some features [23], [24] are designed for describing TMI regions at different luminance levels, achieving a certain effect....


References
Journal ArticleDOI
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and its promise is demonstrated through comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.

40,609 citations
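The SSIM index described above combines luminance, contrast and structure comparisons into one closed-form score. A minimal NumPy sketch of that formula, computed over global image statistics rather than the 11x11 sliding Gaussian window the reference implementation uses, looks like this:

```python
import numpy as np

def ssim_global(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Single-window (global-statistics) variant of the structural
    similarity index.  The reference implementation instead applies
    the same formula inside a sliding 11x11 Gaussian window and
    averages the resulting map; c1 and c2 are the standard stability
    constants for 8-bit dynamic range."""
    x, y = x.astype(np.float64), y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )

img = np.tile(np.arange(256, dtype=np.float64), (256, 1))  # simple ramp image
noisy = img + np.random.default_rng(1).normal(0, 25, img.shape)

assert abs(ssim_global(img, img) - 1.0) < 1e-12   # identical images score 1
assert ssim_global(img, noisy) < 1.0              # distortion lowers the score
```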

Book ChapterDOI
06 Sep 2014
TL;DR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models, used in a diagnostic role to find model architectures that outperform Krizhevsky et al on the ImageNet classification benchmark.
Abstract: Large Convolutional Network models have recently demonstrated impressive classification performance on the ImageNet benchmark of Krizhevsky et al. [18]. However, there is no clear understanding of why they perform so well, or how they might be improved. In this paper we explore both issues. We introduce a novel visualization technique that gives insight into the function of intermediate feature layers and the operation of the classifier. Used in a diagnostic role, these visualizations allow us to find model architectures that outperform Krizhevsky et al. on the ImageNet classification benchmark. We also perform an ablation study to discover the performance contribution from different model layers. We show our ImageNet model generalizes well to other datasets: when the softmax classifier is retrained, it convincingly beats the current state-of-the-art results on the Caltech-101 and Caltech-256 datasets.

12,783 citations


"Blind Quality Estimation by Disenta..." refers background in this paper

  • ...A detailed analysis of how a generic CNN generates its features is explained in [28]....


Proceedings ArticleDOI
09 Nov 2003
TL;DR: This paper proposes a multiscale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions, and develops an image synthesis method to calibrate the parameters that define the relative importance of different scales.
Abstract: The structural similarity image quality paradigm is based on the assumption that the human visual system is highly adapted for extracting structural information from the scene, and therefore a measure of structural similarity can provide a good approximation to perceived image quality. This paper proposes a multiscale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions. We develop an image synthesis method to calibrate the parameters that define the relative importance of different scales. Experimental comparisons demonstrate the effectiveness of the proposed method.

4,333 citations


"Blind Quality Estimation by Disenta..." refers methods in this paper

  • ...The Tone-Mapped Quality Index (TMQI) metric [13] follows the structural fidelity criterion [14], to compare an HDR image with its tone-mapped version, by embedding the knowledge of the Contrast Sensitivity Function (CSF) at different values of luminance [15], [16]....


Journal ArticleDOI
TL;DR: Despite its simplicity, it is able to show that BRISQUE is statistically better than the full-reference peak signal-to-noise ratio and the structural similarity index, and is highly competitive with respect to all present-day distortion-generic NR IQA algorithms.
Abstract: We propose a natural scene statistic-based distortion-generic blind/no-reference (NR) image quality assessment (IQA) model that operates in the spatial domain. The new model, dubbed blind/referenceless image spatial quality evaluator (BRISQUE) does not compute distortion-specific features, such as ringing, blur, or blocking, but instead uses scene statistics of locally normalized luminance coefficients to quantify possible losses of “naturalness” in the image due to the presence of distortions, thereby leading to a holistic measure of quality. The underlying features used derive from the empirical distribution of locally normalized luminances and products of locally normalized luminances under a spatial natural scene statistic model. No transformation to another coordinate frame (DCT, wavelet, etc.) is required, distinguishing it from prior NR IQA approaches. Despite its simplicity, we are able to show that BRISQUE is statistically better than the full-reference peak signal-to-noise ratio and the structural similarity index, and is highly competitive with respect to all present-day distortion-generic NR IQA algorithms. BRISQUE has very low computational complexity, making it well suited for real time applications. BRISQUE features may be used for distortion-identification as well. To illustrate a new practical application of BRISQUE, we describe how a nonblind image denoising algorithm can be augmented with BRISQUE in order to perform blind image denoising. Results show that BRISQUE augmentation leads to performance improvements over state-of-the-art methods. A software release of BRISQUE is available online: http://live.ece.utexas.edu/research/quality/BRISQUE_release.zip for public use and evaluation.

3,780 citations


"Blind Quality Estimation by Disenta..." refers methods or result in this paper

  • ...C. IQA Reference Schemes Since the research on HDR NR-IQA is still at its preliminary stage and there is no generally accepted benchmark metric, we compared our approach with a number of state-of-the-art LDR NR-IQA methods: BRISQUE [7], SSEQ [9], BIQI [6], DIIVINE [8], and kCNN [10], with and without preprocessing operators....


  • ...Best performances are obtained by using BRISQUE [7] and kCNN [26]....


  • ...By themselves, BRISQUE, BIQI, SSEQ and DIIVINE seem to be unable to adapt to the different image sizes and luminance ranges in the testing set, when these features are different from the training set....


  • ...BRISQUE [7] computes the Mean Subtracted Contrast Normalized (MSCN) image as a feature, using MSCN(i, j) = (I(i, j) − μ_N(i, j)) / (σ_N(i, j) + 1), where μ_N(i, j) and σ_N(i, j) represent the local mean and standard deviation computed over a Gaussian window of size N around the point (i, j)....


  • ...The high performances of BRISQUE and kCNN can be attributed to the features they use, i.e., the MSCN coefficients....

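The MSCN computation quoted in the excerpts above is straightforward to sketch. The NumPy version below substitutes a uniform (box) local window for the Gaussian-weighted window BRISQUE actually uses, which keeps the sketch dependency-free:

```python
import numpy as np

def local_mean(img, n):
    """Mean over an n-by-n uniform window at every pixel, with reflect
    padding at the borders (BRISQUE uses Gaussian weights instead)."""
    pad = n // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros_like(img, dtype=np.float64)
    h, w = img.shape
    for di in range(n):
        for dj in range(n):
            out += padded[di:di + h, dj:dj + w]
    return out / (n * n)

def mscn(img, n=7):
    """Mean Subtracted Contrast Normalized coefficients:
    MSCN(i, j) = (I(i, j) - mu(i, j)) / (sigma(i, j) + 1)."""
    img = img.astype(np.float64)
    mu = local_mean(img, n)
    var = np.maximum(local_mean(img * img, n) - mu * mu, 0.0)
    return (img - mu) / (np.sqrt(var) + 1.0)

# A constant image has zero local error everywhere, so MSCN is all zeros.
assert np.allclose(mscn(np.full((16, 16), 5.0)), 0.0)
```

For natural images, the histogram of these coefficients is what BRISQUE models with natural-scene statistics; distortions change its shape.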

Journal ArticleDOI
TL;DR: DIIVINE is capable of assessing the quality of a distorted image across multiple distortion categories, as against most NR IQA algorithms that are distortion-specific in nature, and is statistically superior to the often used measure of peak signal-to-noise ratio (PSNR) and statistically equivalent to the popular structural similarity index (SSIM).
Abstract: Our approach to blind image quality assessment (IQA) is based on the hypothesis that natural scenes possess certain statistical properties which are altered in the presence of distortion, rendering them un-natural; and that by characterizing this un-naturalness using scene statistics, one can identify the distortion afflicting the image and perform no-reference (NR) IQA. Based on this theory, we propose an (NR)/blind algorithm-the Distortion Identification-based Image Verity and INtegrity Evaluation (DIIVINE) index-that assesses the quality of a distorted image without need for a reference image. DIIVINE is based on a 2-stage framework involving distortion identification followed by distortion-specific quality assessment. DIIVINE is capable of assessing the quality of a distorted image across multiple distortion categories, as against most NR IQA algorithms that are distortion-specific in nature. DIIVINE is based on natural scene statistics which govern the behavior of natural images. In this paper, we detail the principles underlying DIIVINE, the statistical features extracted and their relevance to perception and thoroughly evaluate the algorithm on the popular LIVE IQA database. Further, we compare the performance of DIIVINE against leading full-reference (FR) IQA algorithms and demonstrate that DIIVINE is statistically superior to the often used measure of peak signal-to-noise ratio (PSNR) and statistically equivalent to the popular structural similarity index (SSIM). A software release of DIIVINE has been made available online: http://live.ece.utexas.edu/research/quality/DIIVINE_release.zip for public use and evaluation.

1,501 citations


"Blind Quality Estimation by Disenta..." refers methods or result in this paper

  • ...C. IQA Reference Schemes Since the research on HDR NR-IQA is still at its preliminary stage and there is no generally accepted benchmark metric, we compared our approach with a number of state-of-the-art LDR NR-IQA methods: BRISQUE [7], SSEQ [9], BIQI [6], DIIVINE [8], and kCNN [10], with and without preprocessing operators....


  • ...Since the research on HDR NR-IQA is still at its preliminary stage and there is no generally accepted benchmark metric, we compared our approach with a number of state-of-the-art LDR NR-IQA methods: BRISQUE [7], SSEQ [9], BIQI [6], DIIVINE [8], and kCNN [10], with and without preprocessing operators....


  • ...Examples are the Blind/Referenceless Image Spatial QUality Evaluator (BRISQUE) [7], the Distortion Identification-based Image Verity and INtegrity Evaluation (DIIVINE) [8] and the Spatial-Spectral Entropy based Quality metric (SSEQ) [9]....


  • ...DIIVINE [8] uses divisive normalized steerable pyramid decomposition coefficients to create the feature image....

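The divisive normalization that DIIVINE applies to steerable-pyramid coefficients, mentioned in the excerpt above, can be illustrated generically. The sketch below normalizes an arbitrary bank of filter-response maps by their pooled local energy; the steerable pyramid itself and the Gaussian neighborhood weighting DIIVINE uses are simplified away:

```python
import numpy as np

def divisive_normalize(responses, sigma=0.1):
    """Generic divisive normalization: each filter response is divided
    by the pooled energy of all responses at the same location.
    DIIVINE applies this idea to steerable-pyramid coefficients with
    Gaussian-weighted neighborhoods; both are simplified away here."""
    energy = np.sqrt(sigma ** 2 + np.sum(responses ** 2, axis=0))
    return responses / energy  # broadcast (H, W) energy over all K maps

# Three hypothetical filter-response maps over a 32x32 location grid.
rng = np.random.default_rng(2)
r = rng.normal(size=(3, 32, 32))
norm = divisive_normalize(r)

# |r_k| / sqrt(sigma^2 + sum_j r_j^2) <= 1, so normalized responses
# are bounded regardless of the input's dynamic range.
assert np.all(np.abs(norm) <= 1.0)
```

This bounding effect is why divisive normalization makes the resulting coefficient statistics more stable across images, which is what the DIIVINE features rely on.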