Author

Manish Narwaria

Bio: Manish Narwaria is an academic researcher from Centre national de la recherche scientifique. The author has contributed to research in topics: Tone mapping & Human visual system model. The author has an h-index of 19 and has co-authored 41 publications that have received 1,828 citations. Previous affiliations of Manish Narwaria include Nanyang Technological University & University of Nantes.

Papers
Journal ArticleDOI
TL;DR: The main goal of this paper is to shed light on limitations of the current ML-based objective quality predictor approach both from practical and theoretical perspectives wherever applicable, and in the process propose an alternate approach to overcome some of them.
Abstract: Objective assessment of multimedia quality using machine learning (ML) has been gaining popularity, especially in the context of both traditional (e.g., terrestrial and satellite broadcast) and advanced (such as over-the-top media services, IPTV) broadcast services. Being data-driven, these methods rely on training to find the optimal model parameters. Therefore, to statistically compare and validate such ML-based quality predictors, the current approach randomly splits the given data into training and test sets and obtains a performance measure (for instance, mean squared error or correlation coefficient). The process is repeated a large number of times and parametric tests (e.g., the $t$-test) are then employed to statistically compare mean (or median) prediction accuracies. However, the current approach suffers from a few limitations (related to the qualitative aspects of training and testing data, the use of improper sample size for statistical testing, possibly dependent sample observations, and a lack of focus on quantifying the learning ability of the ML-based objective quality predictor) which have not been addressed in the literature. Therefore, the main goal of this paper is to shed light on the said limitations both from practical and theoretical perspectives wherever applicable, and in the process propose an alternate approach to overcome some of them. As a major advantage, the proposed guidelines not only help in a theoretically more grounded statistical comparison but also provide useful insights into how well the ML-based objective quality predictors exploit data structure for learning. We demonstrate the added value of the proposed set of guidelines on standard datasets by comparing the performance of a few existing ML-based quality estimators. A software implementation of the presented guidelines is also made publicly available to enable researchers and developers to test and compare different models in a repeatable manner.
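The conventional validation procedure that this abstract describes (and criticizes) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' released software: the linear least-squares predictor, the synthetic data, the 80/20 split, and the helper names `repeated_split_scores` and `welch_t` are all hypothetical. Note that the scores from repeated splits of the same data overlap and so are not independent, which is one of the limitations the abstract raises.

```python
import numpy as np

def pearson(a, b):
    """Pearson linear correlation coefficient between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a @ b) / np.sqrt((a @ a) * (b @ b)))

def repeated_split_scores(features, mos, n_repeats=100, train_frac=0.8, seed=0):
    """Repeat a random train/test split and return one test correlation per split.

    The predictor is a plain least-squares linear model (an assumption made
    for brevity; any trainable quality predictor could take its place).
    """
    rng = np.random.default_rng(seed)
    n = len(mos)
    n_train = int(train_frac * n)
    X = np.column_stack([features, np.ones(n)])  # append a bias column
    scores = np.empty(n_repeats)
    for r in range(n_repeats):
        idx = rng.permutation(n)
        tr, te = idx[:n_train], idx[n_train:]
        w, *_ = np.linalg.lstsq(X[tr], mos[tr], rcond=None)
        scores[r] = pearson(X[te] @ w, mos[te])
    return scores

def welch_t(a, b):
    """Welch's t statistic for the difference between two sample means."""
    va, vb = a.var(ddof=1), b.var(ddof=1)
    return float((a.mean() - b.mean()) / np.sqrt(va / len(a) + vb / len(b)))

# Synthetic stand-in for features and subjective scores (hypothetical data).
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 3))
mos = x @ np.array([1.0, 0.5, -0.3]) + 0.1 * rng.normal(size=200)

scores_full = repeated_split_scores(x, mos)         # predictor using all features
scores_weak = repeated_split_scores(x[:, 2:], mos)  # predictor using one weak feature
t_stat = welch_t(scores_full, scores_weak)
```

A large positive `t_stat` would conventionally be read as the first predictor being better, but because the 100 scores per predictor come from overlapping splits, the nominal sample size overstates the evidence.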

8 citations

Proceedings ArticleDOI
TL;DR: The HDR Visual Difference Predictor (HDR-VDP-2) is primarily a visibility prediction metric, i.e., it predicts whether a signal distortion is visible to the eye and to what extent, but it also employs a pooling function to compute an overall quality score.
Abstract: High Dynamic Range (HDR) signals capture much higher contrasts as compared to the traditional 8-bit low dynamic range (LDR) signals. This is achieved by representing the visual signal via values that are related to the real-world luminance, instead of gamma encoded pixel values which is the case with LDR. Therefore, HDR signals cover a larger luminance range and tend to have more visual appeal. However, due to the higher luminance conditions, the existing methods cannot be directly employed for objective quality assessment of HDR signals. For that reason, the HDR Visual Difference Predictor (HDR-VDP-2) has been proposed. HDR-VDP-2 is primarily a visibility prediction metric i.e. whether the signal distortion is visible to the eye and to what extent. Nevertheless, it also employs a pooling function to compute an overall quality score. This paper focuses on the pooling aspect in HDR-VDP-2 and employs a comprehensive database of HDR images (with their corresponding subjective ratings) to improve the prediction accuracy of HDR-VDP-2. We also discuss and evaluate the existing objective methods and provide a perspective towards better HDR quality assessment.

7 citations

Proceedings ArticleDOI
11 Jul 2011
TL;DR: Experiments conducted using two publicly available video databases show the effectiveness of the proposed full-reference metric in comparison to the relevant existing VQA metrics.
Abstract: Objective video quality assessment (VQA) is the use of computational models to predict the video quality in line with the perception of the human visual system (HVS). It is challenging due to the underlying complexity, and the relatively limited understanding of the HVS and its intricate mechanisms. There are two important issues regarding VQA: (a) the temporal factors apart from the spatial ones also need to be considered, (b) the contribution of each factor and their interaction to the overall video quality needs to be determined. In this paper, we attempt to tackle the first issue by utilizing the variation of spatial quality along the temporal axis. The second issue is addressed by the use of machine learning; we believe this to be more convincing since the relationship between the factors and the overall quality is derived via training with substantial ground truth (i.e. subjective scores). Experiments conducted using two publicly available video databases show the effectiveness of the proposed full-reference metric in comparison to the relevant existing VQA metrics.

6 citations

Proceedings ArticleDOI
TL;DR: This work investigates a new objective method for TMO parameter optimization based on quantification of contrast reversal and naturalness that does not require any prior knowledge about the input HDR image and works independently of the TMO used.
Abstract: Dynamic range compression (or tone mapping) of HDR content is an essential step towards rendering it on traditional LDR displays in a meaningful way. This is, however, non-trivial, and one of the reasons is that tone mapping operators (TMOs) usually need content-specific parameters to achieve the said goal. While subjective TMO parameter adjustment is the most accurate, it may not be easily deployable in many practical applications. Its subjective nature can also influence the comparison of different operators. Thus, there is a need for objective TMO parameter selection to automate the rendering process. To that end, we investigate a new objective method for TMO parameter optimization. Our method is based on quantification of contrast reversal and naturalness. As an important advantage, it does not require any prior knowledge about the input HDR image and works independently of the TMO used. Experimental results using a variety of HDR images and several popular TMOs demonstrate the value of our method in comparison to default TMO parameter settings.

6 citations

Proceedings ArticleDOI
TL;DR: This paper presents a universal method for TMO parameter tuning that maintains as many details as possible, which is desirable in security applications, and notes a possible increase in privacy intrusion.
Abstract: High Dynamic Range (HDR) imaging has been gaining popularity in recent years. Different from the traditional low dynamic range (LDR), HDR content tends to be visually more appealing and realistic as it can represent the dynamic range of the visual stimuli present in the real world. As a result, more scene details can be faithfully reproduced. As a direct consequence, the visual quality tends to improve. HDR can also be directly exploited for new applications such as video surveillance and other security tasks. Since more scene details are available in HDR, it can help in identifying/tracking visual information which otherwise might be difficult with typical LDR content due to factors such as lack/excess of illumination, extreme contrast in the scene, etc. On the other hand, with HDR, there might be issues related to increased privacy intrusion. To display HDR content on a regular screen, tone-mapping operators (TMOs) are used. In this paper, we present a universal method for TMO parameter tuning, in order to maintain as many details as possible, which is desirable in security applications. The method’s performance is verified on several TMOs by comparing the outcomes from tone-mapping with default and optimized parameters. The results suggest that the proposed approach preserves more information, which could be of advantage for security surveillance but, on the other hand, makes us consider a possible increase in privacy intrusion.

6 citations


Cited by
Journal ArticleDOI
TL;DR: Despite its simplicity, BRISQUE is shown to be statistically better than the full-reference peak signal-to-noise ratio and the structural similarity index, and is highly competitive with respect to all present-day distortion-generic NR IQA algorithms.
Abstract: We propose a natural scene statistic-based distortion-generic blind/no-reference (NR) image quality assessment (IQA) model that operates in the spatial domain. The new model, dubbed blind/referenceless image spatial quality evaluator (BRISQUE) does not compute distortion-specific features, such as ringing, blur, or blocking, but instead uses scene statistics of locally normalized luminance coefficients to quantify possible losses of “naturalness” in the image due to the presence of distortions, thereby leading to a holistic measure of quality. The underlying features used derive from the empirical distribution of locally normalized luminances and products of locally normalized luminances under a spatial natural scene statistic model. No transformation to another coordinate frame (DCT, wavelet, etc.) is required, distinguishing it from prior NR IQA approaches. Despite its simplicity, we are able to show that BRISQUE is statistically better than the full-reference peak signal-to-noise ratio and the structural similarity index, and is highly competitive with respect to all present-day distortion-generic NR IQA algorithms. BRISQUE has very low computational complexity, making it well suited for real time applications. BRISQUE features may be used for distortion-identification as well. To illustrate a new practical application of BRISQUE, we describe how a nonblind image denoising algorithm can be augmented with BRISQUE in order to perform blind image denoising. Results show that BRISQUE augmentation leads to performance improvements over state-of-the-art methods. A software release of BRISQUE is available online: http://live.ece.utexas.edu/research/quality/BRISQUE_release.zip for public use and evaluation.
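The "locally normalized luminance coefficients" mentioned in this abstract can be illustrated with a short sketch. It is not the BRISQUE release linked above: the 7x7 Gaussian window, the stabilizing constant `c = 1.0`, the reflect padding, and the function names are assumptions chosen for illustration, and the later steps (fitting distributions to the coefficients and regressing to a quality score) are omitted.

```python
import numpy as np

def gaussian_kernel_1d(size=7, sigma=7 / 6):
    """Normalized 1-D Gaussian kernel (window size and sigma are assumed values)."""
    x = np.arange(size) - (size - 1) / 2
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def blur(img, kernel):
    """Separable Gaussian blur with reflect padding; output has the input's shape."""
    pad = len(kernel) // 2
    padded = np.pad(img, pad, mode="reflect")
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows)

def mscn(img, c=1.0):
    """Locally normalized luminance: (I - local mean) / (local std + c)."""
    img = np.asarray(img, dtype=np.float64)
    k = gaussian_kernel_1d()
    mu = blur(img, k)
    var = blur(img * img, k) - mu * mu
    sigma = np.sqrt(np.maximum(var, 0.0))  # clamp tiny negative rounding errors
    return (img - mu) / (sigma + c)

# Hypothetical grayscale image with values in [0, 255].
rng = np.random.default_rng(0)
image = rng.uniform(0, 255, size=(32, 32))
coeffs = mscn(image)

# Products of horizontally adjacent coefficients, one example of the
# "products of locally normalized luminances" the abstract refers to.
h_pairs = coeffs[:, :-1] * coeffs[:, 1:]
```

On a perfectly flat image the coefficients are all approximately zero, which is a quick sanity check for the normalization.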

3,780 citations

Journal ArticleDOI
TL;DR: It is found that the pixel-wise gradient magnitude similarity (GMS) between the reference and distorted images, combined with a novel pooling strategy (the standard deviation of the GMS map), can accurately predict perceptual image quality.
Abstract: It is an important task to faithfully evaluate the perceptual quality of output images in many applications, such as image compression, image restoration, and multimedia streaming. A good image quality assessment (IQA) model should not only deliver high quality prediction accuracy, but also be computationally efficient. The efficiency of IQA metrics is becoming particularly important due to the increasing proliferation of high-volume visual data in high-speed networks. We present a new effective and efficient IQA model, called gradient magnitude similarity deviation (GMSD). The image gradients are sensitive to image distortions, while different local structures in a distorted image suffer different degrees of degradation. This motivates us to explore the use of global variation of gradient based local quality map for overall image quality prediction. We find that the pixel-wise gradient magnitude similarity (GMS) between the reference and distorted images, combined with a novel pooling strategy (the standard deviation of the GMS map), can accurately predict perceptual image quality. The resulting GMSD algorithm is much faster than most state-of-the-art IQA methods, and delivers highly competitive prediction accuracy. MATLAB source code of GMSD can be downloaded at http://www4.comp.polyu.edu.hk/~cslzhang/IQA/GMSD/GMSD.htm.
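The computation outlined in this abstract (a pixel-wise gradient magnitude similarity map, pooled by its standard deviation) can be sketched in a few lines. This is a simplified illustration, not the MATLAB code linked above: the Prewitt gradient filters, the constant `c = 170.0` (assumed for an 8-bit intensity range), and the helper names are assumptions, and any preprocessing in the published implementation, such as downsampling, is omitted.

```python
import numpy as np

# 3x3 Prewitt kernels for horizontal and vertical gradients (assumed filter choice).
PREWITT_X = np.array([[1, 0, -1]] * 3) / 3.0
PREWITT_Y = PREWITT_X.T

def filter3(img, k):
    """'Valid' 3x3 cross-correlation; output is 2 pixels smaller per dimension."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(3):
        for j in range(3):
            out += k[i, j] * img[i:i + h - 2, j:j + w - 2]
    return out

def gradient_magnitude(img):
    gx = filter3(img, PREWITT_X)
    gy = filter3(img, PREWITT_Y)
    return np.sqrt(gx ** 2 + gy ** 2)

def gmsd(ref, dist, c=170.0):
    """Standard deviation of the gradient magnitude similarity (GMS) map.

    0 means identical gradient structure; larger values mean more distortion.
    """
    m_r = gradient_magnitude(np.asarray(ref, dtype=np.float64))
    m_d = gradient_magnitude(np.asarray(dist, dtype=np.float64))
    gms = (2 * m_r * m_d + c) / (m_r ** 2 + m_d ** 2 + c)
    return float(gms.std())

# Hypothetical reference image and a noisy copy of it.
rng = np.random.default_rng(0)
reference = rng.uniform(0, 255, size=(64, 64))
distorted = reference + rng.normal(scale=20.0, size=reference.shape)

score_same = gmsd(reference, reference)
score_noisy = gmsd(reference, distorted)
```

Pooling by standard deviation rather than by mean rewards a uniform GMS map: the score rises when some regions are degraded much more than others.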

1,211 citations

Journal ArticleDOI
TL;DR: A systematic, comprehensive and up-to-date review of perceptual visual quality metrics (PVQMs) to predict picture quality according to human perception.

895 citations

Journal ArticleDOI
TL;DR: Extensive experiments performed on four large-scale benchmark databases demonstrate that the proposed IQA index VSI works better in terms of the prediction accuracy than all state-of-the-art IQA indices the authors can find while maintaining a moderate computational complexity.
Abstract: Perceptual image quality assessment (IQA) aims to use computational models to measure image quality consistently with subjective evaluations. Visual saliency (VS) has been widely studied by psychologists, neurobiologists, and computer scientists during the last decade to investigate which areas of an image will attract the most attention of the human visual system. Intuitively, VS is closely related to IQA in that suprathreshold distortions can largely affect VS maps of images. With this consideration, we propose a simple but very effective full reference IQA method using VS. In our proposed IQA model, the role of VS is twofold. First, VS is used as a feature when computing the local quality map of the distorted image. Second, when pooling the quality score, VS is employed as a weighting function to reflect the importance of a local region. The proposed IQA index is called visual saliency-based index (VSI). Several prominent computational VS models have been investigated in the context of IQA and the best one is chosen for VSI. Extensive experiments performed on four large-scale benchmark databases demonstrate that the proposed IQA index VSI works better in terms of the prediction accuracy than all state-of-the-art IQA indices we can find while maintaining a moderate computational complexity. The MATLAB source code of VSI and the evaluation results are publicly available online at http://sse.tongji.edu.cn/linzhang/IQA/VSI/VSI.htm.
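The second role of VS described in this abstract, weighting local quality by saliency during pooling, amounts to a weighted average. The sketch below shows only that pooling step under assumptions: taking the pixel-wise maximum of the reference and distorted saliency maps as the weight is an illustrative choice, the maps are toy data, and `saliency_weighted_pool` is a hypothetical name. Computing the saliency maps and the local quality map themselves is out of scope here.

```python
import numpy as np

def saliency_weighted_pool(local_quality, sal_ref, sal_dist):
    """Weighted average of a local quality map, weighted by visual saliency.

    The weight at each pixel is the larger of the two saliency values
    (an assumption made for this sketch).
    """
    weights = np.maximum(sal_ref, sal_dist)
    return float((local_quality * weights).sum() / weights.sum())

# Toy 4x4 example: quality is perfect everywhere except one pixel,
# and that pixel is also the most salient one.
quality = np.ones((4, 4))
quality[0, 0] = 0.2
saliency = np.full((4, 4), 0.1)
saliency[0, 0] = 1.0

pooled = saliency_weighted_pool(quality, saliency, saliency)
plain_mean = float(quality.mean())
```

Because the degraded pixel sits in the salient region, `pooled` falls well below `plain_mean`, which is the behavior saliency weighting is meant to capture.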

823 citations

Posted Content
TL;DR: In this article, a gradient magnitude similarity deviation (GMSD) method was proposed for image quality assessment, where the pixel-wise GMS between the reference and distorted images was combined with a novel pooling strategy to predict accurately perceptual image quality.
Abstract: It is an important task to faithfully evaluate the perceptual quality of output images in many applications such as image compression, image restoration and multimedia streaming. A good image quality assessment (IQA) model should not only deliver high quality prediction accuracy but also be computationally efficient. The efficiency of IQA metrics is becoming particularly important due to the increasing proliferation of high-volume visual data in high-speed networks. We present a new effective and efficient IQA model, called gradient magnitude similarity deviation (GMSD). The image gradients are sensitive to image distortions, while different local structures in a distorted image suffer different degrees of degradation. This motivates us to explore the use of global variation of gradient based local quality map for overall image quality prediction. We find that the pixel-wise gradient magnitude similarity (GMS) between the reference and distorted images, combined with a novel pooling strategy (the standard deviation of the GMS map), can accurately predict perceptual image quality. The resulting GMSD algorithm is much faster than most state-of-the-art IQA methods, and delivers highly competitive prediction accuracy.

742 citations