Multiscale structural similarity for image quality assessment
Citations
Image Super-Resolution Using Deep Convolutional Networks
FSIM: A Feature Similarity Index for Image Quality Assessment
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
No-Reference Image Quality Assessment in the Spatial Domain
Making a “Completely Blind” Image Quality Analyzer
References
Image quality assessment: from error visibility to structural similarity
A universal image quality index
A Parametric Texture Model Based on Joint Statistics of Complex Wavelet Coefficients
Image quality measures and their performance
Foundations of vision
Frequently Asked Questions (23)
Q2. What is the logistic function used in the VQEG Phase I FR-TV test?
For quantitative performance evaluation, the authors use the logistic function adopted in the video quality experts group (VQEG) Phase I FR-TV test [15] to provide a non-linear mapping between the objective and subjective scores.
Q3. What are the measures of prediction accuracy?
After the non-linear mapping, the linear correlation coefficient (CC), the mean absolute error (MAE), and the root mean squared error (RMS) between the subjective and objective scores are calculated as measures of prediction accuracy.
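As a sketch of this evaluation step, the snippet below maps hypothetical objective scores through an assumed 3-parameter logistic (the paper's exact parameterization may differ) and computes CC, MAE, and RMS against hypothetical subjective scores; all numbers and the parameter values b1, b2, b3 are illustrative assumptions:

```python
import numpy as np

def logistic(x, b1, b2, b3):
    # An illustrative 3-parameter logistic for objective -> subjective
    # mapping; the exact form used in the VQEG test may differ.
    return b1 / (1.0 + np.exp(-b2 * (x - b3)))

# Hypothetical objective scores and subjective (MOS) scores.
obj = np.array([0.60, 0.70, 0.80, 0.90, 0.95])
mos = np.array([20.0, 35.0, 55.0, 80.0, 90.0])

# Assumed logistic parameters (in practice these are fitted to the data).
pred = logistic(obj, b1=100.0, b2=15.0, b3=0.78)

cc  = np.corrcoef(pred, mos)[0, 1]           # linear correlation coefficient (CC)
mae = np.mean(np.abs(pred - mos))            # mean absolute error (MAE)
rms = np.sqrt(np.mean((pred - mos) ** 2))    # root mean squared error (RMS)
```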
Q4. What is the drawback of a parameter setting?
The drawback of such a parameter setting is that when the denominator of Eq. (6) is close to 0, the resulting measurement becomes unstable.
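The instability, and the role of the stabilizing constant, can be seen on two near-uniform patches whose standard deviations are almost zero; the patch values and the K2 = 0.03, L = 255 constants below are illustrative:

```python
import numpy as np

# Two visually identical, near-uniform patches with tiny perturbations.
x = np.full(16, 100.0)
x[0] += 1e-6
y = np.full(16, 100.0)
y[0] += 2e-6
sx, sy = x.std(), y.std()

# Bare contrast ratio: denominator is ~0, so imperceptible noise
# drives the value far from 1 even though the patches look identical.
bare = (2 * sx * sy) / (sx**2 + sy**2)

# Stabilized version with a small constant in numerator and denominator.
C2 = (0.03 * 255) ** 2                              # K2 = 0.03, L = 255
stable = (2 * sx * sy + C2) / (sx**2 + sy**2 + C2)  # ~1, as expected
```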
Q5. What is the main purpose of the paper?
In this paper, the authors used an image synthesis approach to calibrate the parameters that define the relative importance between scales.
Q6. What is the way to evaluate the performance of a SSIM model?
In the development of top-down image quality models (such as structural similarity based algorithms), one of the most challenging problems is to calibrate the model parameters, which are rather “abstract” and cannot be directly derived from simple-stimulus subjective experiments as in the bottom-up models.
Q7. What are the widely used image quality and distortion assessment algorithms?
The most widely used full-reference image quality and distortion assessment algorithms are peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which do not correlate well with perceived quality (e.g., [1]–[6]).
Q8. What is the purpose of this paper?
In this paper, the authors propose a multi-scale structural similarity method and introduce a novel image synthesis-based approach to calibrate the parameters that weight the relative importance between different scales.
Q9. What is the bit rate of the test images?
The bit rate ranges from 0.028 to 3.150 bits/pixel, which allows the test images to cover a wide quality range, from indistinguishable from the original image to highly distorted.
Q10. What scales are used to compute contrast comparisons?
At the j-th scale, the contrast comparison (2) and the structure comparison (3) are calculated and denoted as cj(x,y) and sj(x,y), respectively.
Q11. What is the SSIM index definition for scale M?
In particular, a single-scale implementation for Scale M applies the iterative filtering and downsampling procedure up to Scale M, and only the exponents αM, βM and γM are given nonzero values.
Q12. How many distorted images are used in this experiment?
The authors employ 10 original 64×64 images with different types of content (human faces, natural scenes, plants, man-made objects, etc.) in their experiment to create 10 sets of distorted images (a total of 600 distorted images).
Q13. What is the definition of objective image quality assessment?
It is based on a top-down assumption that the HVS is highly adapted for extracting structural information from the scene, and therefore a measure of structural similarity should be a good approximation of perceived image quality.
Q14. How many scales are used in the experiment?
The authors use 5 scales and 12 distortion levels (ranging from 2^3 to 2^14) in their experiment, resulting in a total of 60 images, as demonstrated in Fig.
Q15. Why is the distorted image more similar to the original?
This is not surprising because image coding techniques such as JPEG and JPEG2000 usually compress fine-scale details to a much higher degree than coarse-scale structures, and thus the distorted image “looks” more similar to the original image if evaluated at larger scales.
Q16. How does the system compare the image at different scales?
Taking the reference and distorted image signals as the input, the system iteratively applies a low-pass filter and downsamples the filtered image by a factor of 2.
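The filter-and-downsample iteration can be sketched as follows; a simple 2×2 box filter stands in for the paper's low-pass filter, and `downsample2` is a hypothetical helper name:

```python
import numpy as np

def downsample2(img):
    """One pyramid step: 2x2 box (average) low-pass filter, then
    downsample by a factor of 2 in each dimension. A box filter is an
    assumption here, standing in for the paper's low-pass filter."""
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]          # crop to even size
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] +
                   img[0::2, 1::2] + img[1::2, 1::2])

# Build three scales of a toy 8x8 image: 8x8 -> 4x4 -> 2x2.
img = np.arange(64.0).reshape(8, 8)
scales = [img]
for _ in range(2):
    scales.append(downsample2(scales[-1]))

shapes = [s.shape for s in scales]   # [(8, 8), (4, 4), (2, 2)]
```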
Q17. What is the drawback of the method?
The authors consider this a drawback of the method because the right scale depends on viewing conditions (e.g., display resolution and viewing distance).
Q18. What is the general form of the SSIM index between x and y?
The general form of the Structural SIMilarity (SSIM) index between signals x and y is defined as SSIM(x,y) = [l(x,y)]^α · [c(x,y)]^β · [s(x,y)]^γ (Eq. 5), where α, β and γ are parameters that define the relative importance of the three components.
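As a minimal sketch of this formula, the snippet below computes the luminance, contrast, and structure terms from global image statistics and combines them with the exponents; the actual SSIM implementation uses local windows, and the K1 = 0.01, K2 = 0.03 defaults are assumed here:

```python
import numpy as np

def ssim_components(x, y, K1=0.01, K2=0.03, L=255):
    """Global (whole-image) luminance l, contrast c, and structure s
    terms of SSIM; a sketch, not the windowed implementation."""
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    C3 = C2 / 2.0                                  # common choice for C3
    mx, my = x.mean(), y.mean()
    sx, sy = x.std(), y.std()
    sxy = ((x - mx) * (y - my)).mean()
    l = (2 * mx * my + C1) / (mx**2 + my**2 + C1)  # luminance comparison
    c = (2 * sx * sy + C2) / (sx**2 + sy**2 + C2)  # contrast comparison
    s = (sxy + C3) / (sx * sy + C3)                # structure comparison
    return l, c, s

def ssim(x, y, alpha=1.0, beta=1.0, gamma=1.0):
    # SSIM(x,y) = l^alpha * c^beta * s^gamma
    l, c, s = ssim_components(x, y)
    return (l ** alpha) * (c ** beta) * (s ** gamma)
```

For identical inputs all three terms equal 1, so the index is 1; any distortion pulls it below 1.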
Q19. How is the overall SSIM evaluation obtained?
The overall SSIM evaluation is obtained by combining the measurements at different scales using SSIM(x,y) = [lM(x,y)]^αM · ∏_{j=1}^{M} [cj(x,y)]^βj · [sj(x,y)]^γj.
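This multi-scale combination can be sketched as below: contrast and structure are measured at every scale, luminance only at the coarsest, and each scale's exponent is set to the calibrated weights reported for MS-SSIM. Global statistics stand in for the paper's local windows, and the 2×2 box filter is an assumed stand-in for its low-pass filter:

```python
import numpy as np

def ms_ssim(x, y, weights=(0.0448, 0.2856, 0.3001, 0.2363, 0.1333),
            K1=0.01, K2=0.03, L=255):
    """Sketch of MS-SSIM: c_j and s_j at scales j = 1..M, l_M at the
    coarsest scale M, with beta_j = gamma_j = alpha_M = weights[j-1]
    (the calibrated weights reported for MS-SSIM). Global statistics
    are used instead of local windows for brevity."""
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    C3 = C2 / 2.0
    result = 1.0
    for j, w in enumerate(weights):
        mx, my = x.mean(), y.mean()
        sx, sy = x.std(), y.std()
        sxy = ((x - mx) * (y - my)).mean()
        c = (2 * sx * sy + C2) / (sx**2 + sy**2 + C2)   # contrast at scale j
        s = (sxy + C3) / (sx * sy + C3)                 # structure at scale j
        result *= (c * s) ** w
        if j == len(weights) - 1:
            # Luminance comparison only at the coarsest scale M.
            result *= ((2 * mx * my + C1) / (mx**2 + my**2 + C1)) ** w
        else:
            # Low-pass (2x2 box, an assumption) and downsample by 2.
            x = 0.25 * (x[0::2, 0::2] + x[1::2, 0::2] +
                        x[0::2, 1::2] + x[1::2, 1::2])
            y = 0.25 * (y[0::2, 0::2] + y[1::2, 0::2] +
                        y[0::2, 1::2] + y[1::2, 1::2])
    return result
```

The input size must be divisible by 2^(M-1) = 16 for the five downsampling steps; a windowed implementation would also need to guard against a negative c·s before the fractional power.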
Q20. What is the performance of the SSIM model?
From both the scatter plots and the quantitative evaluation results, the authors see that the performance of the single-scale SSIM model varies with scale, and the best performance is given by the case of M = 2.
Q21. What do µx, σx and σxy measure?
µx and σx can be viewed as estimates of the luminance and contrast of x, and σxy measures the tendency of x and y to vary together, and is thus an indication of structural similarity.
Q22. What is the dynamic range of the pixel values?
L is the dynamic range of the pixel values (L = 255 for 8 bits/pixel grayscale images), and K1 ≪ 1 and K2 ≪ 1 are two scalar constants.
Q23. How many images are available in the LIVE database?
The authors test a number of image quality assessment algorithms using the LIVE database (available at [13]), which includes 344 JPEG and JPEG2000 compressed images (typically 768×512 or similar size).