Author

Jiheng Wang

Other affiliations: Food and Drug Administration
Bio: Jiheng Wang is an academic researcher from the University of Waterloo. The author has contributed to research on topics including image quality and stereoscopy, has an h-index of 11, and has co-authored 30 publications receiving 446 citations. Previous affiliations of Jiheng Wang include the Food and Drug Administration.

Papers
Journal Article
Jiheng Wang, Abdul Rehman, Kai Zeng, Shiqi Wang, Zhou Wang
TL;DR: A binocular rivalry-inspired multi-scale model to predict the quality of stereoscopic images from that of the single-view images is proposed, and the results show that the proposed model successfully eliminates the prediction bias, leading to significantly improved quality prediction of stereoscopic images.
Abstract: Objective quality assessment of distorted stereoscopic images is a challenging problem, especially when the distortions in the left and right views are asymmetric. Existing studies suggest that simply averaging the quality of the left and right views well predicts the quality of symmetrically distorted stereoscopic images, but generates substantial prediction bias when applied to asymmetrically distorted stereoscopic images. In this paper, we first build a database that contains both single-view and symmetrically and asymmetrically distorted stereoscopic images. We then carry out a subjective test, where we find that the quality prediction bias of the asymmetrically distorted images could lean toward opposite directions (overestimate or underestimate), depending on the distortion types and levels. Our subjective test also suggests that the eye dominance effect does not have a strong impact on the visual quality decisions of stereoscopic images. Furthermore, we develop an information content and divisive normalization-based pooling scheme that improves upon structural similarity in estimating the quality of single-view images. Finally, we propose a binocular rivalry-inspired multi-scale model to predict the quality of stereoscopic images from that of the single-view images. Our results show that the proposed model, without explicitly identifying image distortion types, successfully eliminates the prediction bias, leading to significantly improved quality prediction of the stereoscopic images. (Some partial preliminary results of this work were presented at the International Workshop on Video Processing and Quality Metrics for Consumer Electronics, Chandler, AZ, January 2014, and at the IEEE International Conference on Multimedia and Expo, Chengdu, China, July 2014.)
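As a rough illustration of the binocular rivalry idea in the abstract above, the sketch below combines precomputed single-view quality scores of the left and right views using weights driven by the local energy of each distorted view, so that the stronger stimulus dominates. This is a minimal Python sketch, not the paper's implementation: the Gaussian-window energy measure, the constants, and the function names are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def local_energy(img, sigma=2.0):
    """Local variance under a Gaussian window, used here as a crude proxy
    for the stimulus strength that drives binocular rivalry."""
    mu = ndimage.gaussian_filter(img, sigma)
    mu_sq = ndimage.gaussian_filter(img * img, sigma)
    return np.maximum(mu_sq - mu * mu, 0.0)

def rivalry_weighted_quality(q_left, q_right, dist_left, dist_right):
    """Combine single-view quality scores q_left and q_right using weights
    derived from the mean local energy of each distorted view."""
    e_l = local_energy(dist_left).mean()
    e_r = local_energy(dist_right).mean()
    w_l = e_l / (e_l + e_r + 1e-12)
    return w_l * q_left + (1.0 - w_l) * q_right
```

The paper applies this kind of weighting within a multi-scale model together with the pooling scheme mentioned above; a single global weight per view is used here only to keep the example short.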

138 citations

Journal Article
TL;DR: A binocular rivalry-inspired model is applied to account for the prediction bias, leading to a significantly improved full-reference quality prediction model for stereoscopic videos that allows quantitative prediction of the coding gain of different variations of asymmetric video compression and provides new insight into the development of high-efficiency 3D video coding schemes.
Abstract: Objective quality assessment of stereoscopic 3D video is challenging but highly desirable, especially for stereoscopic video compression and transmission, where useful quality models that can guide the critical decision-making steps in the selection of mixed-resolution coding, asymmetric quantization, and pre- and post-processing schemes are still missing. Here we first carry out subjective quality assessment experiments on two databases that contain various asymmetrically compressed stereoscopic 3D videos obtained from mixed-resolution coding, asymmetric transform-domain quantization coding, their combinations, and multiple choices of post-processing techniques. We compare these asymmetric stereoscopic video coding schemes with symmetric coding methods and verify their potential coding gains. We observe a strong systematic bias when using direct averaging of the 2D video quality of both views to predict 3D video quality. We then apply a binocular rivalry-inspired model to account for the prediction bias, leading to a significantly improved full-reference quality prediction model for stereoscopic videos. The model allows us to quantitatively predict the coding gain of different variations of asymmetric video compression, and provides new insight into the development of high-efficiency 3D video coding schemes.

37 citations

Journal Article
TL;DR: Experiments show that a conceptually simple image classification method, which does not involve any registration, intensity normalization or sophisticated feature extraction processes, and does not rely on any modeling of the image patterns or distortion processes, achieves competitive performance with reduced computational cost.
Abstract: The complex wavelet structural similarity (CW-SSIM) index has been recognized as a novel image similarity measure with broad potential applications due to its robustness to small geometric distortions such as translation, scaling, and rotation of images. Nevertheless, how to make the best use of it in image classification problems has not been deeply investigated. In this paper, we introduce a series of novel image classification algorithms based on CW-SSIM and use handwritten digit recognition and face recognition as examples for demonstration. Among the proposed approaches, the best compromise between accuracy and complexity is obtained by the CW-SSIM support vector machine based algorithm, which combines an unsupervised clustering method that divides the training images into clusters with representative images and a supervised learning method based on support vector machines that maximizes the classification accuracy. Our experiments show that such a conceptually simple image classification method, which does not involve any registration, intensity normalization, or sophisticated feature extraction processes, and does not rely on any modeling of the image patterns or distortion processes, achieves competitive performance with reduced computational cost.
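The clustering-plus-SVM pipeline described above can be sketched as follows, assuming scikit-learn is available. The cw_ssim_stub stand-in is a plain normalized correlation so the example runs end to end; a real CW-SSIM implementation would replace it, and the function names and parameter choices are illustrative rather than the authors' code.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def cw_ssim_stub(a, b):
    """Placeholder for a real CW-SSIM measure: a plain normalized
    correlation, used only so the sketch is self-contained."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())

def fit_cwssim_svm(train_imgs, labels, n_templates=10):
    """Cluster the training images, keep one representative per cluster,
    and train an SVM on vectors of similarities to those representatives."""
    flat = np.stack([im.ravel() for im in train_imgs]).astype(float)
    km = KMeans(n_clusters=n_templates, n_init=10).fit(flat)
    reps = [int(np.argmin(np.linalg.norm(flat - c, axis=1)))
            for c in km.cluster_centers_]
    templates = [train_imgs[i] for i in reps]
    feats = np.array([[cw_ssim_stub(im, t) for t in templates]
                      for im in train_imgs])
    clf = SVC(kernel="rbf").fit(feats, labels)
    return clf, templates
```

At test time, the same similarity vector against the stored templates is computed for a new image and passed to clf.predict.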

36 citations

Proceedings Article
14 Jul 2014
TL;DR: An information-content and divisive normalization based pooling scheme that improves upon SSIM in estimating the quality of single view images and a binocular rivalry inspired model to predict the quality of stereoscopic images based on that of the single view images are proposed.
Abstract: Objective quality assessment of distorted stereoscopic images is a challenging problem. Existing studies suggest that simply averaging the quality of the left and right views well predicts the quality of symmetrically distorted stereoscopic images, but generates substantial prediction bias when applied to asymmetrically distorted stereoscopic images. In this study, we first carry out a subjective test, where we find that the prediction bias could lean towards opposite directions, largely depending on the distortion types. We then develop an information-content and divisive normalization based pooling scheme that improves upon SSIM in estimating the quality of single-view images. Finally, we propose a binocular rivalry inspired model to predict the quality of stereoscopic images based on that of the single-view images. Our results show that the proposed model, without explicitly identifying image distortion types, successfully eliminates the prediction bias, leading to significantly improved quality prediction of stereoscopic images.
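The pooling scheme mentioned above can be pictured, very loosely, as weighting a per-pixel SSIM map by local information content before averaging. The sketch below uses a simple log-energy weight in the spirit of information-content weighting; the window size, the noise constant, and the function name are assumptions for illustration rather than the exact formulation proposed in the paper.

```python
import numpy as np
from scipy import ndimage

def info_weighted_pool(ssim_map, ref, sigma=2.0, sigma_n=0.1):
    """Pool a per-pixel SSIM map using weights derived from the local
    information content (log signal-to-noise energy) of the reference."""
    mu = ndimage.gaussian_filter(ref, sigma)
    var = np.maximum(ndimage.gaussian_filter(ref * ref, sigma) - mu * mu, 0.0)
    weights = np.log2(1.0 + var / (sigma_n ** 2))
    return float((weights * ssim_map).sum() / (weights.sum() + 1e-12))
```

This shows only the information-content weighting part; the divisive-normalization component of the paper's scheme is not reproduced here.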

33 citations

Proceedings Article
03 Dec 2015
TL;DR: It is observed that perceived video quality generally increases with frame rate, but the gain saturates at high rates, and such gain also depends on the interactions between quantization level, spatial resolution, and spatial and motion complexities.
Abstract: High frame rate video has been a hot topic in the past few years, driven by a strong need in the entertainment and gaming industry. Nevertheless, progress on perceptual quality assessment of high frame rate video remains limited, making it difficult to evaluate the exact perceptual gain of switching from low to high frame rates. In this work, we first conduct a subjective quality assessment experiment on a database that contains videos compressed at different frame rates, quantization levels, and spatial resolutions. We then carry out a series of analyses on the subjective data to investigate the impact of frame rate on perceived video quality and its interplay with quantization level, spatial resolution, spatial complexity, and motion complexity. We observe that perceived video quality generally increases with frame rate, but the gain saturates at high rates. Such gain also depends on the interactions between quantization level, spatial resolution, and spatial and motion complexities.

31 citations


Cited by
Journal Article
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal Article
19 Apr 2016 - Test
TL;DR: The present article reviews the most recent theoretical and methodological developments for random forests, with special attention given to the selection of parameters, the resampling mechanism, and variable importance measures.
Abstract: The random forest algorithm, proposed by L. Breiman in 2001, has been extremely successful as a general-purpose classification and regression method. The approach, which combines several randomized decision trees and aggregates their predictions by averaging, has shown excellent performance in settings where the number of variables is much larger than the number of observations. Moreover, it is versatile enough to be applied to large-scale problems, is easily adapted to various ad hoc learning tasks, and returns measures of variable importance. The present article reviews the most recent theoretical and methodological developments for random forests. Emphasis is placed on the mathematical forces driving the algorithm, with special attention given to the selection of parameters, the resampling mechanism, and variable importance measures. This review is intended to provide non-experts easy access to the main ideas.
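As a concrete reference point for the algorithm this review covers, the snippet below fits a random forest with scikit-learn: an ensemble of randomized decision trees whose predictions are aggregated, with variable importances available after training. The dataset and the parameter values are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Bagged ensemble of randomized trees; predictions are aggregated by
# majority vote (averaging for regression), and impurity-based variable
# importances come for free.
rf = RandomForestClassifier(n_estimators=200, max_features="sqrt", random_state=0)
rf.fit(X_tr, y_tr)
print("test accuracy:", rf.score(X_te, y_te))
print("variable importances:", rf.feature_importances_)
```

The tuning choices the review emphasizes (number of trees, number of candidate variables per split, resampling scheme) map to n_estimators, max_features, and the bootstrap option in this API.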

1,279 citations

Posted Content
TL;DR: A review of the most recent theoretical and methodological developments for random forests can be found in this article, with special attention given to the selection of parameters, the resampling mechanism, and variable importance measures.
Abstract: The random forest algorithm, proposed by L. Breiman in 2001, has been extremely successful as a general-purpose classification and regression method. The approach, which combines several randomized decision trees and aggregates their predictions by averaging, has shown excellent performance in settings where the number of variables is much larger than the number of observations. Moreover, it is versatile enough to be applied to large-scale problems, is easily adapted to various ad-hoc learning tasks, and returns measures of variable importance. The present article reviews the most recent theoretical and methodological developments for random forests. Emphasis is placed on the mathematical forces driving the algorithm, with special attention given to the selection of parameters, the resampling mechanism, and variable importance measures. This review is intended to provide non-experts easy access to the main ideas.

1,119 citations

Journal Article

559 citations

Journal Article
TL;DR: A new no-reference (NR)/blind sharpness metric in the autoregressive (AR) parameter space is established via the analysis of AR model parameters, first calculating the energy- and contrast-differences in the locally estimated AR coefficients in a pointwise way, and then quantifying the image sharpness with percentile pooling to predict the overall score.
Abstract: In this paper, we propose a new no-reference (NR)/blind sharpness metric in the autoregressive (AR) parameter space. Our model is established via the analysis of AR model parameters, first calculating the energy- and contrast-differences in the locally estimated AR coefficients in a pointwise way, and then quantifying the image sharpness with percentile pooling to predict the overall score. In addition to the luminance domain, we further consider the inevitable effect of color information on the visual perception of sharpness and thereby extend the above model to the widely used YIQ color space. Validation of our technique is conducted on the subsets with blurring artifacts from four large-scale image databases (LIVE, TID2008, CSIQ, and TID2013). Experimental results confirm the superiority and efficiency of our method over existing NR algorithms, the state-of-the-art blind sharpness/blurriness estimators, and classical full-reference quality evaluators. Furthermore, the proposed metric can also be extended to stereoscopic images based on binocular rivalry, and attains remarkably high performance on the LIVE3D-I and LIVE3D-II databases.
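To make the AR-parameter idea above concrete, here is a heavily simplified sketch: an 8-neighbour AR model is fitted per block by least squares, simple energy and contrast statistics of the coefficients are computed, and block scores are pooled with a high percentile. The block-wise (rather than pointwise) estimation, the particular statistics, and the pooling constant are illustrative simplifications, not the metric proposed in the paper.

```python
import numpy as np

def block_ar_coeffs(block):
    """Least-squares fit of an 8-neighbour autoregressive model inside one
    block: each interior pixel is predicted from its 8 neighbours."""
    h, w = block.shape
    ys, Xs = [], []
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            nb = block[i - 1:i + 2, j - 1:j + 2].ravel()
            Xs.append(np.delete(nb, 4))  # drop the centre pixel itself
            ys.append(block[i, j])
    coef, *_ = np.linalg.lstsq(np.asarray(Xs, float), np.asarray(ys, float), rcond=None)
    return coef

def ar_sharpness(img, block=16, percentile=90):
    """Energy times contrast of the AR coefficients per block, pooled with
    a high percentile to emphasise the sharpest regions."""
    scores = []
    h, w = img.shape
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            c = block_ar_coeffs(img[i:i + block, j:j + block])
            scores.append(float(np.sum(c ** 2)) * float(c.max() - c.min()))
    return float(np.percentile(scores, percentile))
```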

296 citations