Author

C.-C. Jay Kuo

Bio: C.-C. Jay Kuo is an academic researcher from the University of Southern California. The author has contributed to research in topics including computer science and wavelets. The author has an h-index of 59 and has co-authored 955 publications receiving 16,671 citations. Previous affiliations of C.-C. Jay Kuo include Ningbo University and Beihang University.


Papers
Journal Article
TL;DR: This paper describes a recently created image database, TID2013, intended for the evaluation of full-reference visual quality assessment metrics, and presents a methodology for determining the drawbacks of existing metrics.
Abstract: This paper describes a recently created image database, TID2013, intended for evaluation of full-reference visual quality assessment metrics. With respect to TID2008, the new database contains a larger number (3000) of test images obtained from 25 reference images, 24 types of distortions for each reference image, and 5 levels for each type of distortion. Motivations for introducing 7 new types of distortions and one additional level of distortions are given; examples of distorted images are presented. Mean opinion scores (MOS) for the new database have been collected by performing 985 subjective experiments with volunteers (observers) from five countries (Finland, France, Italy, Ukraine, and USA). The availability of MOS allows the use of the designed database as a fundamental tool for assessing the effectiveness of visual quality metrics. Furthermore, existing visual quality metrics have been tested with the proposed database, and the collected results have been analyzed using rank order correlation coefficients between MOS and the considered metrics. These correlation indices have been obtained both for the full set of distorted images and for specific image subsets, highlighting advantages and drawbacks of existing state-of-the-art quality metrics. Approaches to thorough performance analysis for a given metric are presented to detect practical situations or distortion types for which this metric is not adequate with respect to human perception. The created image database and the collected MOS values are freely available for downloading and use for scientific purposes. Highlights: We have created a new large database. This database contains a larger number of distorted images and distortion types. MOS values for all images are obtained and provided. An analysis of the correlation between MOS and a wide set of existing metrics is carried out. A methodology for determining drawbacks of existing visual quality metrics is described.
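As a concrete illustration of the correlation analysis described above, here is a minimal Python sketch computing Spearman and Kendall rank order correlations between MOS values and an objective metric's scores with SciPy; the data arrays are hypothetical placeholders, not values from TID2013.

```python
# Hypothetical MOS values and metric scores; real use would load TID2013 data.
from scipy.stats import spearmanr, kendalltau

mos = [4.2, 3.1, 2.5, 4.8, 1.9, 3.6]           # subjective mean opinion scores
metric = [0.91, 0.74, 0.62, 0.95, 0.41, 0.80]  # objective metric values (e.g., SSIM)

srocc, _ = spearmanr(mos, metric)    # Spearman rank order correlation coefficient
krocc, _ = kendalltau(mos, metric)   # Kendall rank order correlation coefficient
print(f"SROCC = {srocc:.3f}, KROCC = {krocc:.3f}")
```

A higher rank correlation means the metric orders the distorted images more nearly the way human observers do.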

943 citations

Journal Article
TL;DR: A systematic, comprehensive, and up-to-date review of perceptual visual quality metrics (PVQMs), which predict picture quality according to human perception, is presented.

895 citations

Journal Article
TL;DR: A heuristic rule-based procedure, built upon morphological and statistical analysis of time-varying audio features, is proposed to segment and classify audio signals.
Abstract: While current approaches for audiovisual data segmentation and classification are mostly focused on visual cues, audio signals may actually play a more important role in content parsing for many applications. An approach to automatic segmentation and classification of audiovisual data based on audio content analysis is proposed. The audio signal from movies or TV programs is segmented and classified into basic types such as speech, music, song, environmental sound, speech with music background, environmental sound with music background, silence, etc. Simple audio features, including the energy function, the average zero-crossing rate, the fundamental frequency, and the spectral peak tracks, are extracted to ensure the feasibility of real-time processing. A heuristic rule-based procedure, built upon morphological and statistical analysis of the time-varying functions of these audio features, is proposed to segment and classify audio signals. Experimental results show that the proposed scheme achieves an accuracy rate of more than 90% in audio classification.
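As a rough sketch of two of the simple features named above, the following Python computes short-time energy and the zero-crossing rate per frame; the frame length and hop size are illustrative assumptions, not values from the paper.

```python
import numpy as np

def frame_features(signal, frame_len=512, hop=256):
    """Per-frame short-time energy and zero-crossing rate of a 1-D signal."""
    energies, zcrs = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len].astype(np.float64)
        energies.append(np.sum(frame ** 2))                    # energy function
        # Zero-crossing rate: fraction of adjacent samples whose sign changes.
        zcrs.append(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
    return np.array(energies), np.array(zcrs)

# Toy usage on noise; real input would be PCM samples from a movie soundtrack.
energy, zcr = frame_features(np.random.default_rng(0).standard_normal(48000))
```

Rule-based classification along the lines of the paper would then threshold and track these per-frame curves over time, e.g. flagging low-energy runs as silence.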

473 citations

Proceedings Article
10 Jun 2013
TL;DR: A new database of color images with various sets of distortions, called TID2013, is presented; it contains a larger number of images than its predecessor and adds seven new types and one more level of distortions.
Abstract: Visual quality of color images is an important aspect in various applications of digital image processing and multimedia. A large number of visual quality metrics (indices) has been proposed recently. In order to assess their reliability, several databases of color images with various sets of distortions have been exploited. Here we present a new database, called TID2013, that contains a larger number of images. Compared to its predecessor TID2008, seven new types and one more level of distortions are included. The need for considering these new types of distortions is briefly described. Besides, preliminary results of experiments with a large number of volunteers for determining the mean opinion score (MOS) are presented. Spearman and Kendall rank order correlation factors between MOS and a set of popular metrics are calculated and presented. Their analysis shows that the adequacy of the existing metrics is worth improving. Special attention is to be paid to accounting for color information and observers' focus of attention on locally active areas in images.

446 citations

Journal Article
TL;DR: An efficient method is proposed to obtain a good initial codebook that can accelerate the convergence of the generalized Lloyd algorithm and achieve a better local minimum as well.
Abstract: The generalized Lloyd algorithm plays an important role in the design of vector quantizers (VQ) and in feature clustering for pattern recognition. In the VQ context, this algorithm provides a procedure to iteratively improve a codebook and results in a local minimum that minimizes the average distortion function. We propose an efficient method to obtain a good initial codebook that can accelerate the convergence of the generalized Lloyd algorithm and achieve a better local minimum as well.
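For context, here is a minimal sketch of one generalized Lloyd iteration for VQ codebook design: assign each training vector to its nearest codeword, then replace each codeword by the centroid of its cell. The random initialization below is a naive placeholder for the paper's proposed initial-codebook method, which is not reproduced here.

```python
import numpy as np

def lloyd_step(codebook, data):
    """One generalized Lloyd iteration: nearest-neighbor partition + centroid update."""
    # Squared Euclidean distance from every training vector to every codeword.
    d2 = ((data[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    assign = d2.argmin(axis=1)
    new_codebook = codebook.copy()
    for k in range(len(codebook)):
        members = data[assign == k]
        if len(members):                       # leave empty cells unchanged
            new_codebook[k] = members.mean(axis=0)
    return new_codebook

rng = np.random.default_rng(1)
data = rng.standard_normal((1000, 4))                       # training vectors
codebook = data[rng.choice(len(data), 16, replace=False)]   # naive initialization
for _ in range(20):                                         # iterate to a local minimum
    codebook = lloyd_step(codebook, data)
```

Because each iteration only refines the current partition, the quality of the final local minimum depends heavily on the initial codebook, which is exactly the lever the paper targets.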

374 citations


Cited by
Journal Article
TL;DR: In this article, a structural similarity index is proposed for image quality assessment based on the degradation of structural information, and it is validated against both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
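As a quick way to try the index, the sketch below computes SSIM between a reference image and a noisy copy using scikit-image's standard implementation (not the authors' MATLAB code linked above); the images are synthetic placeholders.

```python
import numpy as np
from skimage.metrics import structural_similarity

rng = np.random.default_rng(2)
reference = rng.random((128, 128))   # placeholder grayscale image in [0, 1]
distorted = np.clip(reference + 0.05 * rng.standard_normal((128, 128)), 0.0, 1.0)

score = structural_similarity(reference, distorted, data_range=1.0)
print(f"SSIM = {score:.3f}")   # 1.0 would mean the images are identical
```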

40,609 citations

Journal Article
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories.

First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules.

Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, handwriting recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs.

Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules.

Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically.

Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal Article
TL;DR: The convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function is proved, establishing its utility in detecting the modes of the density.
Abstract: A general non-parametric technique is proposed for the analysis of a complex multimodal feature space and the delineation of arbitrarily shaped clusters in it. The basic computational module of the technique is an old pattern recognition procedure: the mean shift. For discrete data, we prove the convergence of a recursive mean shift procedure to the nearest stationary point of the underlying density function and, thus, its utility in detecting the modes of the density. The relation of the mean shift procedure to the Nadaraya-Watson estimator from kernel regression and to the robust M-estimators of location is also established. Algorithms for two low-level vision tasks, discontinuity-preserving smoothing and image segmentation, are described as applications. In these algorithms, the only user-set parameter is the resolution of the analysis, and either gray-level or color images are accepted as input. Extensive experimental results illustrate their excellent performance.
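To make the procedure concrete, here is a minimal sketch of the recursive mean shift update with a flat (uniform) kernel: a point repeatedly moves to the mean of its neighbors within a window until it stops moving, at which point it has reached a mode of the density. The bandwidth value is an illustrative assumption.

```python
import numpy as np

def mean_shift_point(x, data, bandwidth=0.5, tol=1e-5, max_iter=100):
    """Shift a single point x toward the nearest density mode of data."""
    for _ in range(max_iter):
        neighbors = data[np.linalg.norm(data - x, axis=1) < bandwidth]
        shifted = neighbors.mean(axis=0)       # the mean shift update
        if np.linalg.norm(shifted - x) < tol:  # stationary point: a mode
            return shifted
        x = shifted
    return x

# Toy usage: two Gaussian blobs; a point from the first blob converges to its center.
rng = np.random.default_rng(3)
data = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(1, 0.1, (50, 2))])
mode = mean_shift_point(data[0], data)
```

Clustering then amounts to running this update from every point and grouping points whose trajectories converge to the same mode.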

11,727 citations

Christopher M. Bishop
01 Jan 2006
TL;DR: This article covers probability distributions and linear models for regression and classification, along with neural networks, kernel methods, and a discussion of combining models in the context of machine learning.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations