Author

Manish Narwaria

Bio: Manish Narwaria is an academic researcher from Centre national de la recherche scientifique. The author has contributed to research in topics: Tone mapping & Human visual system model. The author has an h-index of 19 and has co-authored 41 publications receiving 1828 citations. Previous affiliations of Manish Narwaria include Nanyang Technological University & University of Nantes.

Papers
Journal ArticleDOI
TL;DR: This paper investigates image features based on two-dimensional mel-cepstrum for the purpose of IQA and proposes a new metric by formulating IQA as a pattern recognition problem, which helps to overcome the limitations of the existing pooling methods.

40 citations

Journal ArticleDOI
TL;DR: This paper proposes a nonintrusive metric for the quality assessment of noise-suppressed speech and utilizes the sensitivity of FBEs to noise in order to obtain an effective representation of speech towards quality assessment.
Abstract: Objective speech quality assessment is a challenging task which aims to emulate human judgment in the complex and time-consuming task of subjective assessment. It is difficult to perform in line with human perception due to the complex and nonlinear nature of the human auditory system. The challenge lies in representing speech signals using appropriate features and subsequently mapping these features into a quality score. This paper proposes a nonintrusive metric for the quality assessment of noise-suppressed speech. The originality of the proposed approach lies primarily in the use of Mel filter bank energies (FBEs) as features and the use of support vector regression (SVR) for feature mapping. We utilize the sensitivity of FBEs to noise in order to obtain an effective representation of speech towards quality assessment. In addition, the use of SVR exploits the advantages of kernels which allow the regression algorithm to learn complex data patterns via nonlinear transformation for an effective and generalized mapping of features into the quality score. Extensive experiments conducted using two third-party databases with different noise-suppressed speech signals show the effectiveness of the proposed approach.

39 citations
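
The feature-extraction step named in the abstract above (Mel filter bank energies) can be illustrated with a minimal sketch. It is not the authors' implementation: the triangular filter shape, the `2595 * log10(1 + f / 700)` mel formula (one common convention), the default of 8 filters, and the function names are all assumptions made for illustration, and the SVR mapping from features to a quality score is left out.

```python
import math

def hz_to_mel(f_hz):
    # One common mel-scale convention (assumed; the paper may use another).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filter_bank_energies(power_spectrum, sample_rate, n_filters=8):
    """Weight a one-sided power spectrum with triangular mel-spaced filters.

    power_spectrum: list of floats covering 0 .. sample_rate / 2 (at least 2 bins).
    Returns one energy value per filter.
    """
    n_bins = len(power_spectrum)
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    # n_filters + 2 edge frequencies, equally spaced on the mel scale.
    edges_hz = [mel_to_hz(max_mel * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * nyquist / (n_bins - 1) for k in range(n_bins)]
    energies = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        energy = 0.0
        for f, p in zip(bin_hz, power_spectrum):
            if lo < f <= mid:      # rising edge of the triangle
                energy += p * (f - lo) / (mid - lo)
            elif mid < f < hi:     # falling edge of the triangle
                energy += p * (hi - f) / (hi - mid)
        energies.append(energy)
    return energies
```

In a full pipeline, the per-frame energies would be collected into a feature vector and passed to a regressor trained on subjective scores.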

Journal ArticleDOI
TL;DR: It is believed that VA needs consideration for evaluating the overall perceptual impact of TMOs on HDR content, since the existing studies so far have only considered the quality or esthetic appeal angle.
Abstract: High Dynamic Range (HDR) content is visually more appealing since it can represent the real luminance of the scene. However, on the downside, this means that a large amount of data needs to be handled both during storage and processing. The other problem is that HDR content cannot be displayed on conventional display devices due to their limited dynamic range. To overcome these two problems, dynamic range compression (or range reduction) is often used and this is accomplished by tone mapping operators (TMOs). As a result of tone mapping, the HDR content is not only fit to be displayed on a regular display device but also compressed. However, from an artistic intention point of view, TMOs are not necessarily transparent and might induce different viewing behavior. It is generally accepted that TMOs reduce visual quality and there have been a number of studies reported in the literature which examine the impact of tone mapping from the viewpoint of perceptual quality. In contrast to this, it is largely unclear if tone mapping will induce changes in visual attention (VA) as well and whether these are significant enough to be accounted for in HDR content processing. To our knowledge, no systematic study exists which sheds light on this issue. Given that VA is a crucial visual perception mechanism which affects the way we perceive visual signals, it is important to study the effect of tone mapping on VA deployment. Towards this goal, this paper investigates and quantifies how TMOs modify VA. Comprehensive subjective tests in the form of eye-tracking experiments have been conducted on several pieces of HDR content and using a large number of TMOs. Further non-parametric statistical analysis has been carried out to ascertain the statistical significance of the results obtained. Our studies suggest that TMOs can indeed modify human attention and fixation behavior. Based on this, we believe that VA needs consideration for evaluating the overall perceptual impact of TMOs on HDR content. As mentioned, since the existing studies so far have only considered the quality or esthetic appeal angle, this study brings in a new perspective regarding the importance of VA in HDR content processing for visualization on LDR displays.

36 citations

Journal ArticleDOI
TL;DR: The utility of the ideas is demonstrated by considering the practical scenario of video broadcast transmissions, with a focus on digital terrestrial television (DTT), by proposing a no-reference objective video quality estimator for this application, and by conducting verification studies on different video content to assess the performance of the proposed solution.
Abstract: Video quality measurement is an important component in the end-to-end video delivery chain. Video quality is, however, subjective, and thus, there will always be interobserver differences in the subjective opinion about the visual quality of the same video. Despite this, most existing works on objective quality measurement typically focus only on predicting a single score and evaluate their prediction accuracies based on how close it is to the mean opinion scores (or similar average based ratings). Clearly, such an approach ignores the underlying diversities in the subjective scoring process and, as a result, does not allow further analysis on how reliable the objective prediction is in terms of subjective variability. Consequently, the aim of this paper is to analyze this issue and present a machine-learning based solution to address it. We demonstrate the utility of our ideas by considering the practical scenario of video broadcast transmissions with focus on digital terrestrial television (DTT) and proposing a no-reference objective video quality estimator for such application. We conducted meaningful verification studies on different video content (including video clips recorded from real DTT broadcast transmissions) in order to verify the performance of the proposed solution.

33 citations

Book ChapterDOI
06 Jan 2010
TL;DR: Experimental results on a total of 1792 speech files and the associated subjective scores indicate that the proposed approach outperforms the standard P.563 algorithm for non-intrusive assessment of speech quality.
Abstract: We propose a new non-intrusive speech quality assessment algorithm based on Support Vector Regression (SVR) and Mel Frequency Cepstral Coefficients (MFCCs). The basic idea is to map the MFCCs into the desired quality score using SVR. The sensitivity of the MFCCs to external noise is exploited to gauge the changes in the speech signal to evaluate its perceptual quality. The use of SVR exploits the advantages of machine learning with the ability to learn complex data patterns for an effective and generalized mapping of features into a perceptual score, in contrast with the oft-utilized feature pooling process in the existing speech quality estimators. Experimental results indicate that the proposed approach outperforms the standard P.563 algorithm for non-intrusive assessment of speech quality with a total of 1792 speech files and the associated subjective scores.

28 citations
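
As a rough illustration of the features named in the abstract above, MFCCs are conventionally obtained by taking a discrete cosine transform (DCT-II) of log filter bank energies. The sketch below shows only that step, under assumptions: the function name, the energy floor, and the unnormalized DCT are illustrative choices, not details taken from the chapter, and the SVR mapping to a quality score is omitted.

```python
import math

def cepstral_coefficients(filter_bank_energies, n_coeffs, floor=1e-10):
    """Unnormalized DCT-II of log filter bank energies (hypothetical helper).

    floor guards against log(0) for empty filters; its value is an assumption.
    """
    log_e = [math.log(max(e, floor)) for e in filter_bank_energies]
    m = len(log_e)
    return [
        sum(le * math.cos(math.pi * n * (k + 0.5) / m) for k, le in enumerate(log_e))
        for n in range(n_coeffs)
    ]
```

A flat set of energies yields a single nonzero coefficient (the zeroth), which is why noise that reshapes the spectral envelope shows up in the higher-order coefficients.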


Cited by
Journal ArticleDOI
TL;DR: Despite its simplicity, BRISQUE is shown to be statistically better than the full-reference peak signal-to-noise ratio and the structural similarity index, and to be highly competitive with respect to all present-day distortion-generic NR IQA algorithms.
Abstract: We propose a natural scene statistic-based distortion-generic blind/no-reference (NR) image quality assessment (IQA) model that operates in the spatial domain. The new model, dubbed blind/referenceless image spatial quality evaluator (BRISQUE) does not compute distortion-specific features, such as ringing, blur, or blocking, but instead uses scene statistics of locally normalized luminance coefficients to quantify possible losses of “naturalness” in the image due to the presence of distortions, thereby leading to a holistic measure of quality. The underlying features used derive from the empirical distribution of locally normalized luminances and products of locally normalized luminances under a spatial natural scene statistic model. No transformation to another coordinate frame (DCT, wavelet, etc.) is required, distinguishing it from prior NR IQA approaches. Despite its simplicity, we are able to show that BRISQUE is statistically better than the full-reference peak signal-to-noise ratio and the structural similarity index, and is highly competitive with respect to all present-day distortion-generic NR IQA algorithms. BRISQUE has very low computational complexity, making it well suited for real time applications. BRISQUE features may be used for distortion-identification as well. To illustrate a new practical application of BRISQUE, we describe how a nonblind image denoising algorithm can be augmented with BRISQUE in order to perform blind image denoising. Results show that BRISQUE augmentation leads to performance improvements over state-of-the-art methods. A software release of BRISQUE is available online: http://live.ece.utexas.edu/research/quality/BRISQUE_release.zip for public use and evaluation.

3,780 citations
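
The "locally normalized luminance coefficients" and their pairwise products mentioned in the abstract above can be sketched as follows. This is a simplified illustration, not the released BRISQUE code: it uses a uniform box window instead of a Gaussian-weighted one, and the window radius, the stabilizing constant `c`, and the function names are assumptions.

```python
import math

def locally_normalized(img, radius=1, c=1.0):
    """Subtract the local mean and divide by the local standard deviation.

    img: 2-D list of luminance values. A uniform (box) window is used here
    for brevity, which is a simplification.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            mu = sum(window) / len(window)
            var = sum((v - mu) ** 2 for v in window) / len(window)
            out[y][x] = (img[y][x] - mu) / (math.sqrt(var) + c)
    return out

def horizontal_products(coeffs):
    """Products of horizontally adjacent normalized coefficients."""
    return [[row[x] * row[x + 1] for x in range(len(row) - 1)] for row in coeffs]
```

The quality features would then be derived from the empirical distributions of these coefficients and products; that fitting step is not shown.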

Journal ArticleDOI
TL;DR: It is found that the pixel-wise gradient magnitude similarity (GMS) between the reference and distorted images, combined with a novel pooling strategy (the standard deviation of the GMS map), can accurately predict perceptual image quality.
Abstract: It is an important task to faithfully evaluate the perceptual quality of output images in many applications, such as image compression, image restoration, and multimedia streaming. A good image quality assessment (IQA) model should not only deliver high quality prediction accuracy, but also be computationally efficient. The efficiency of IQA metrics is becoming particularly important due to the increasing proliferation of high-volume visual data in high-speed networks. We present a new effective and efficient IQA model, called gradient magnitude similarity deviation (GMSD). The image gradients are sensitive to image distortions, while different local structures in a distorted image suffer different degrees of degradations. This motivates us to explore the use of global variation of gradient based local quality map for overall image quality prediction. We find that the pixel-wise gradient magnitude similarity (GMS) between the reference and distorted images, combined with a novel pooling strategy (the standard deviation of the GMS map), can accurately predict perceptual image quality. The resulting GMSD algorithm is much faster than most state-of-the-art IQA methods, and delivers highly competitive prediction accuracy. MATLAB source code of GMSD can be downloaded at http://www4.comp.polyu.edu.hk/~cslzhang/IQA/GMSD/GMSD.htm.

1,211 citations
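
The two steps described in the abstract above (a pixel-wise GMS map, then standard-deviation pooling) can be sketched in a few lines. This is a minimal illustration rather than the authors' MATLAB release: it assumes 3x3 Prewitt filters, evaluates interior pixels only, skips any preprocessing such as downsampling, and treats the stability constant `c=170.0` (intended for 8-bit intensity ranges) as an assumed default.

```python
import math

def prewitt_gradient_magnitude(img):
    """Gradient magnitude of a 2-D list of floats, interior pixels only."""
    h, w = len(img), len(img[0])
    mag = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            gx = sum(img[y + d][x + 1] - img[y + d][x - 1] for d in (-1, 0, 1)) / 3.0
            gy = sum(img[y + 1][x + d] - img[y - 1][x + d] for d in (-1, 0, 1)) / 3.0
            row.append(math.hypot(gx, gy))
        mag.append(row)
    return mag

def gmsd(ref, dist, c=170.0):
    """Standard deviation of the gradient magnitude similarity map.

    0.0 means the two gradient maps agree everywhere; larger values
    indicate more unevenly distributed distortion.
    """
    m_ref = prewitt_gradient_magnitude(ref)
    m_dist = prewitt_gradient_magnitude(dist)
    gms = [
        (2.0 * a * b + c) / (a * a + b * b + c)
        for row_r, row_d in zip(m_ref, m_dist)
        for a, b in zip(row_r, row_d)
    ]
    mean = sum(gms) / len(gms)
    return math.sqrt(sum((g - mean) ** 2 for g in gms) / len(gms))
```

Because the pooled value is a deviation rather than a mean, a distortion that lowers similarity uniformly across the image contributes less than one that damages some regions and spares others.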

Journal ArticleDOI
TL;DR: A systematic, comprehensive and up-to-date review of perceptual visual quality metrics (PVQMs) to predict picture quality according to human perception.

895 citations

Journal ArticleDOI
TL;DR: Extensive experiments performed on four large-scale benchmark databases demonstrate that the proposed IQA index VSI works better in terms of the prediction accuracy than all state-of-the-art IQA indices the authors can find while maintaining a moderate computational complexity.
Abstract: Perceptual image quality assessment (IQA) aims to use computational models to measure image quality in a manner consistent with subjective evaluations. Visual saliency (VS) has been widely studied by psychologists, neurobiologists, and computer scientists during the last decade to investigate which areas of an image will attract the most attention of the human visual system. Intuitively, VS is closely related to IQA in that suprathreshold distortions can largely affect VS maps of images. With this consideration, we propose a simple but very effective full reference IQA method using VS. In our proposed IQA model, the role of VS is twofold. First, VS is used as a feature when computing the local quality map of the distorted image. Second, when pooling the quality score, VS is employed as a weighting function to reflect the importance of a local region. The proposed IQA index is called visual saliency-based index (VSI). Several prominent computational VS models have been investigated in the context of IQA and the best one is chosen for VSI. Extensive experiments performed on four large-scale benchmark databases demonstrate that the proposed IQA index VSI works better in terms of the prediction accuracy than all state-of-the-art IQA indices we can find while maintaining a moderate computational complexity. The MATLAB source code of VSI and the evaluation results are publicly available online at http://sse.tongji.edu.cn/linzhang/IQA/VSI/VSI.htm.

823 citations
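
The second role of VS described in the abstract above (a weighting function during pooling) amounts to a saliency-weighted average of the local quality map. The sketch below shows only that pooling step, under assumptions: both maps are taken as given 2-D lists of equal size, and the function name is hypothetical. How the local quality map and the saliency map are computed is not covered here.

```python
def saliency_weighted_pool(local_quality, saliency):
    """Weighted mean of a local quality map, using saliency as the weights."""
    weighted_sum = 0.0
    weight_total = 0.0
    for q_row, s_row in zip(local_quality, saliency):
        for q, s in zip(q_row, s_row):
            weighted_sum += q * s
            weight_total += s
    if weight_total == 0.0:
        raise ValueError("saliency map sums to zero; pooled score is undefined")
    return weighted_sum / weight_total
```

With a uniform saliency map this reduces to plain mean pooling, so any gain over the mean comes entirely from how well the saliency map tracks the regions viewers attend to.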

Posted Content
TL;DR: In this article, a gradient magnitude similarity deviation (GMSD) method was proposed for image quality assessment, where the pixel-wise GMS between the reference and distorted images was combined with a novel pooling strategy to predict accurately perceptual image quality.
Abstract: It is an important task to faithfully evaluate the perceptual quality of output images in many applications such as image compression, image restoration and multimedia streaming. A good image quality assessment (IQA) model should not only deliver high quality prediction accuracy but also be computationally efficient. The efficiency of IQA metrics is becoming particularly important due to the increasing proliferation of high-volume visual data in high-speed networks. We present a new effective and efficient IQA model, called gradient magnitude similarity deviation (GMSD). The image gradients are sensitive to image distortions, while different local structures in a distorted image suffer different degrees of degradations. This motivates us to explore the use of global variation of gradient based local quality map for overall image quality prediction. We find that the pixel-wise gradient magnitude similarity (GMS) between the reference and distorted images, combined with a novel pooling strategy (the standard deviation of the GMS map), can accurately predict perceptual image quality. The resulting GMSD algorithm is much faster than most state-of-the-art IQA methods, and delivers highly competitive prediction accuracy.

742 citations