Information Content Weighting for Perceptual Image Quality Assessment
Summary
I. INTRODUCTION
- In recent years, there has been an increasing interest in developing objective image quality assessment (IQA) methods that can automatically predict human behaviors in evaluating image quality [1]-[3].
- Spatial domain methods such as the mean squared error (MSE) and the structural similarity (SSIM) index [4], [5] compute pixel- or patch-wise distortion/quality measures in space, while block-discrete cosine transform [6] and wavelet-based [7]-[11] approaches define localized quality/distortion measures across scale, space and orientation.
- This is supported by a number of interesting recent studies [14]-[16], where it has been shown that sizable performance gains can be obtained by combining objective local quality measures with subjective human fixation or region-of-interest detection data.
- The existing pooling approaches can be roughly categorized as follows.
• Local quality/distortion-based pooling
- The intuitive idea that more emphasis should be put at high-distortion regions can be implemented in a more straightforward way by local quality/distortion-based pooling.
- This can be done by using a nonuniform weighting approach, where the weight may be determined by an error visibility detection map [17] .
- It may also be computed using the local quality/distortion measure itself [13], such that the overall quality/distortion measure takes the form of a weighted average M = Σ_i w(m_i) m_i / Σ_i w(m_i) (2), where the weighting function w(·) is monotonically increasing when m is a distortion measure (i.e., larger values indicate higher distortion), and monotonically decreasing when m is a quality measure (i.e., larger values indicate higher quality).
- Another method to assign more weight to low-quality regions is to sort all local values and average only the small percentile of them that corresponds to the lowest-quality regions.
- Local quality/distortion-based pooling has been shown to be effective in improving IQA performance, as reported in [13], [19], though the implementations are often heuristic (for example, in the choice of the weighting function and of the percentile), without theoretical guiding principles.
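The two pooling strategies above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the quadratic weighting exponent and the percentile value are hypothetical choices standing in for the heuristic parameters the text mentions.

```python
import numpy as np

def weighted_pooling(distortion_map, alpha=2.0):
    """Local distortion-based pooling: weight each local distortion value
    by a monotonically increasing function of itself, here w(d) = d**alpha.
    The exponent alpha is an illustrative assumption."""
    d = np.asarray(distortion_map, dtype=float)
    w = d ** alpha                       # monotonically increasing weight
    if w.sum() == 0:
        return 0.0
    return float((w * d).sum() / w.sum())

def percentile_pooling(quality_map, p=6.0):
    """Percentile pooling: keep only the lowest-quality p percent of the
    local quality values and average them."""
    q = np.sort(np.asarray(quality_map, dtype=float).ravel())
    k = max(1, int(round(len(q) * p / 100.0)))
    return float(q[:k].mean())
```

On a uniform map both reduce to the plain average, which is the sanity check one would expect from any weighting scheme.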
• Saliency-based pooling
- Here the authors use "saliency" as a general term that represents low-level local image features that are of perceptual significance (as opposed to high-level components such as human faces).
- The motivation behind saliency-based pooling approaches is that visual attention is attracted to distinctive saliency features and, thus, more importance should be given to the associated regions in the image.
- The saliency features used can range from simple ones such as local variance [13] or contrast [20] to sophisticated computational models based upon automatic point-of-gaze predictions from low-level vision features [19], [21]-[24].
- It has also been found that motion information is another useful feature to use in the pooling stage of video quality assessment algorithms [25] - [27] .
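A minimal sketch of variance-based saliency pooling, assuming non-overlapping blocks: each block's mean quality is weighted by the local variance of the reference image, a simple low-level saliency feature of the kind cited above. The block size and the small stabilizer added to the weights are illustrative assumptions.

```python
import numpy as np

def variance_saliency_pooling(quality_map, image, block=8):
    """Weight each non-overlapping block's mean quality by the local
    variance of the reference image (a simple saliency feature)."""
    H, W = image.shape
    H, W = H - H % block, W - W % block   # crop to a whole number of blocks
    img = image[:H, :W].reshape(H // block, block, W // block, block)
    qm  = quality_map[:H, :W].reshape(H // block, block, W // block, block)
    var = img.var(axis=(1, 3))            # local variance = saliency weight
    q   = qm.mean(axis=(1, 3))            # local mean quality
    w   = var + 1e-8                      # keep weights nonzero in flat regions
    return float((w * q).sum() / w.sum())
```

On a perfectly flat reference image all weights are equal, so the result degenerates to the plain average of the quality map, as it should.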
• Object-based pooling
- Different from low-level vision based saliency approaches, object-based pooling methods resort to high-level cognitive vision based image understanding algorithms that help detect and/or segment significant regions from the image.
- What are lacking are not heuristic tricks but general theoretical principles that are not only qualitatively sensible but also quantitatively manageable, so that reliable computational models for pooling can be derived.
- In essence, their approach is saliency-based, but the resulting weighting function also has interesting connections with quality/distortion-based pooling methods, which the authors discuss in Section II.
- Information theoretic methods are by no means new for IQA.
- In fact, their work is inspired by the success of the visual information fidelity (VIF) method [34], though VIF was not originally proposed for pooling purposes.
II. INFORMATION CONTENT WEIGHTING
- The computation of image information content relies on good statistical image models.
- The remaining task is, thus, the statistical modeling of groups of neighboring pixels (or coefficients).
- To simplify the computation, the authors assume that the scalar multiplier in the statistical model takes a fixed value at each location (but varies over space and scale).
- This was demonstrated empirically in [34] using an image synthesis approach, where images under different types of distortions were compared with synthesized distortion images using the local attenuation/noise model.
- As a result, the mutual information evaluations can be calculated based upon the determinants of the covariance matrices [41], as given in (13)-(15), with the covariance terms defined in (16)-(18). Equation (16) can be further simplified using (19), where E[·] is the expectation operator and the authors have used the fact that the signal and noise components are independent.
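The determinant-based computation rests on a standard identity: for jointly Gaussian vectors, mutual information depends only on covariance determinants. A small sketch of that identity (not the paper's specific equations (13)-(18)):

```python
import numpy as np

def gaussian_mutual_information(cov_x, cov_y, cov_joint):
    """For jointly Gaussian X and Y,
    I(X; Y) = 0.5 * log( det(Cov X) * det(Cov Y) / det(Cov [X, Y]) ),
    so only covariance determinants are needed."""
    num = np.linalg.det(np.asarray(cov_x)) * np.linalg.det(np.asarray(cov_y))
    den = np.linalg.det(np.asarray(cov_joint))
    return 0.5 * np.log(num / den)

# Scalar example: Y = X + N with Var(X) = 1, Var(N) = 1, X and N independent.
# Then Cov(Y) = 2, the joint covariance is [[1, 1], [1, 2]] with determinant 1,
# and I(X; Y) = 0.5 * log(2) nats.
mi = gaussian_mutual_information([[1.0]], [[2.0]], [[1.0, 1.0], [1.0, 2.0]])
```

The independence of signal and noise is what makes the joint covariance determinant collapse to the noise variance in this scalar case, mirroring the simplification the text attributes to (19).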
A. Information Content Weighted PSNR
- Let x_i and y_i be the i-th pixel in the original image x and the distorted image y, respectively.
- The MSE and PSNR between the two images are given by MSE = (1/N) Σ_{i=1}^{N} (x_i − y_i)² (34) and PSNR = 10 log₁₀(L² / MSE) (35), where N is the total number of pixels in the image and L is the maximum dynamic range.
- Here the authors define an information content weighted MSE (IW-MSE) and an information content weighted PSNR (IW-PSNR) measure by incorporating information content weights computed, as in (28), in the Laplacian pyramid transform [40] domain.
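A sketch of the plain and weighted variants, assuming the weight map is given; in the paper the weights are the information content maps computed in a Laplacian pyramid domain, which this sketch does not reproduce.

```python
import numpy as np

def psnr(x, y, L=255.0):
    """Plain definitions: MSE is the mean squared pixel difference,
    PSNR = 10 * log10(L^2 / MSE) with L the maximum dynamic range."""
    mse = np.mean((np.asarray(x, float) - np.asarray(y, float)) ** 2)
    return 10.0 * np.log10(L ** 2 / mse)

def iw_psnr(x, y, weights, L=255.0):
    """IW-MSE/IW-PSNR sketch: replace the uniform average in MSE with a
    weighted average. `weights` stands in for the information content map."""
    e2 = (np.asarray(x, float) - np.asarray(y, float)) ** 2
    w = np.asarray(weights, float)
    iw_mse = (w * e2).sum() / w.sum()
    return 10.0 * np.log10(L ** 2 / iw_mse)
```

With uniform weights IW-PSNR reduces exactly to PSNR, which is the expected degenerate case of any weighted pooling.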
B. Information Content Weighted MultiScale SSIM
- The basic spatial domain SSIM algorithm [5] is based upon separate comparisons of local luminance, contrast and structure between an original and a distorted image.
- Here, μ, σ and σ_xy represent the mean, standard deviation and cross-correlation evaluations, respectively.
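The local SSIM computation can be sketched over a single patch pair, using the standard simplified form in which the contrast and structure terms are merged (the constant choice C3 = C2/2); K1 and K2 are the usual stabilizing constants.

```python
import numpy as np

def ssim_local(x, y, L=255.0, K1=0.01, K2=0.03):
    """SSIM over one pair of local patches: built from the means (mu),
    variances (sigma^2) and cross-correlation (sigma_xy), combined in the
    standard simplified two-term form."""
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()     # cross-correlation term sigma_xy
    return ((2 * mx * my + C1) * (2 * cxy + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))
```

Identical patches score exactly 1, the upper bound of the index; in the full algorithm this local value is computed over a sliding window and then pooled.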
- It has been found that the performance of the previous single-scale SSIM algorithm depends upon the scale at which it is applied [42], [43].
- Interestingly, the measured weight function peaks at middle-resolution scales and drops at both low- and high-resolution scales, consistent with the contrast sensitivity function extensively studied in the vision literature [12].
- The final overall IW-SSIM measure is then computed as in (47), using the same set of scale weights β_j as in MS-SSIM.
C. Interpretation of VIF Based Upon Information Content Weighting
- Based upon the interpretation in its original publication, the VIF algorithm [34] does not seem to fit into the two-stage framework shown in Fig. 1, because the information content is summed over the entire image space before the fidelity ratio is computed, as in (48).
- Here the authors show that with some simple transformations, VIF indeed can be nicely interpreted using the same two-stage framework.
- Specifically, the authors can write VIF = Σ_i w_i VIF_i / Σ_i w_i (49), where they have defined a local VIF measure VIF_i (which follows the same philosophy as the general VIF concept [34]) in (50) and a weighting function w_i in (51). Interestingly, this weight definition is essentially an information content measure, although different from the one used in their approach [as in (12)].
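The rewrite rests on an algebraic identity: a ratio of sums equals a weighted average of local ratios when the weights are the denominators. A numeric check with hypothetical per-location information values:

```python
import numpy as np

# Hypothetical local information content values (not from the paper):
# info_ref[i]  = information the reference carries at location i,
# info_dist[i] = information preserved in the distorted image at location i.
info_ref  = np.array([2.0, 1.0, 4.0])
info_dist = np.array([1.0, 0.5, 3.0])

# Global VIF form: sum over the whole image, then take the ratio.
vif_global = info_dist.sum() / info_ref.sum()

# Two-stage form: local fidelity ratios pooled with the reference
# information content as weights.
v_local = info_dist / info_ref
vif_weighted = (info_ref * v_local).sum() / info_ref.sum()

assert np.isclose(vif_global, vif_weighted)
```

Since the weights cancel against the denominators of the local ratios, the two forms are the same number, which is exactly why VIF admits the two-stage (local measure plus information-weighted pooling) interpretation.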