scispace - formally typeset
Search or ask a question
Author

Kede Ma

Bio: Kede Ma is an academic researcher from City University of Hong Kong. The author has contributed to research in topics: Image quality & Computer science. The author has an hindex of 27, co-authored 73 publications receiving 3768 citations. Previous affiliations of Kede Ma include Courant Institute of Mathematical Sciences & Hong Kong Polytechnic University.

Papers published on a yearly basis

Papers
More filters
Journal ArticleDOI
TL;DR: This paper proposes a novel method by reserving room before encryption with a traditional RDH algorithm, and thus it is easy for the data hider to reversibly embed data in the encrypted image.
Abstract: Recently, more and more attention is paid to reversible data hiding (RDH) in encrypted images, since it maintains the excellent property that the original cover can be losslessly recovered after embedded data is extracted while protecting the image content's confidentiality. All previous methods embed data by reversibly vacating room from the encrypted images, which may be subject to some errors on data extraction and/or image restoration. In this paper, we propose a novel method by reserving room before encryption with a traditional RDH algorithm, and thus it is easy for the data hider to reversibly embed data in the encrypted image. The proposed method can achieve real reversibility, that is, data extraction and image recovery are free of any error. Experiments show that this novel method can embed more than 10 times as large payloads for the same image quality as the previous methods, such as for PSNR=40 dB.

610 citations

Journal ArticleDOI
TL;DR: This paper proposes a novel objective image quality assessment (IQA) algorithm for MEF images based on the principle of the structural similarity approach and a novel measure of patch structural consistency and shows that the proposed model well correlates with subjective judgments and significantly outperforms the existing IQA models for general image fusion.
Abstract: Multi-exposure image fusion (MEF) is considered an effective quality enhancement technique widely adopted in consumer electronics, but little work has been dedicated to the perceptual quality assessment of multi-exposure fused images. In this paper, we first build an MEF database and carry out a subjective user study to evaluate the quality of images generated by different MEF algorithms. There are several useful findings. First, considerable agreement has been observed among human subjects on the quality of MEF images. Second, no single state-of-the-art MEF algorithm produces the best quality for all test images. Third, the existing objective quality models for general image fusion are very limited in predicting perceived quality of MEF images. Motivated by the lack of appropriate objective models, we propose a novel objective image quality assessment (IQA) algorithm for MEF images based on the principle of the structural similarity approach and a novel measure of patch structural consistency. Our experimental results on the subjective database show that the proposed model well correlates with subjective judgments and significantly outperforms the existing IQA models for general image fusion. Finally, we demonstrate the potential application of the proposed model by automatically tuning the parameters of MEF algorithms. 1 The subjective database and the MATLAB code of the proposed model will be made available online. Preliminary results of Section III were presented at the 6th International Workshop on Quality of Multimedia Experience , Singapore, 2014.

530 citations

Journal ArticleDOI
TL;DR: This work establishes a large-scale database named the Waterloo Exploration Database, which in its current state contains 4744 pristine natural images and 94 880 distorted images created from them, and presents three alternative test criteria to evaluate the performance of IQA models, namely, the pristine/distorted image discriminability test, the listwise ranking consistency test, and the pairwise preference consistency test.
Abstract: The great content diversity of real-world digital images poses a grand challenge to image quality assessment (IQA) models, which are traditionally designed and validated on a handful of commonly used IQA databases with very limited content variation. To test the generalization capability and to facilitate the wide usage of IQA techniques in real-world applications, we establish a large-scale database named the Waterloo Exploration Database, which in its current state contains 4744 pristine natural images and 94 880 distorted images created from them. Instead of collecting the mean opinion score for each image via subjective testing, which is extremely difficult if not impossible, we present three alternative test criteria to evaluate the performance of IQA models, namely, the pristine/distorted image discriminability test, the listwise ranking consistency test, and the pairwise preference consistency test (P-test). We compare 20 well-known IQA models using the proposed criteria, which not only provide a stronger test in a more challenging testing environment for existing models, but also demonstrate the additional benefits of using the proposed database. For example, in the P-test, even for the best performing no-reference IQA model, more than 6 million failure cases against the model are “discovered” automatically out of over 1 billion test pairs. Furthermore, we discuss how the new database may be exploited using innovative approaches in the future, to reveal the weaknesses of existing IQA models, to provide insights on how to improve the models, and to shed light on how the next-generation IQA models may be developed. The database and codes are made publicly available at: https://ece.uwaterloo.ca/~k29ma/exploration/ .

495 citations

Journal ArticleDOI
TL;DR: This work demonstrates the strong competitiveness of MEON against state-of-the-art BIQA models using the group maximum differentiation competition methodology and empirically demonstrates that GDN is effective at reducing model parameters/layers while achieving similar quality prediction performance.
Abstract: We propose a multi-task end-to-end optimized deep neural network (MEON) for blind image quality assessment (BIQA). MEON consists of two sub-networks—a distortion identification network and a quality prediction network—sharing the early layers. Unlike traditional methods used for training multi-task networks, our training process is performed in two steps. In the first step, we train a distortion type identification sub-network, for which large-scale training samples are readily available. In the second step, starting from the pre-trained early layers and the outputs of the first sub-network, we train a quality prediction sub-network using a variant of the stochastic gradient descent method. Different from most deep neural networks, we choose biologically inspired generalized divisive normalization (GDN) instead of rectified linear unit as the activation function. We empirically demonstrate that GDN is effective at reducing model parameters/layers while achieving similar quality prediction performance. With modest model complexity, the proposed MEON index achieves state-of-the-art performance on four publicly available benchmarks. Moreover, we demonstrate the strong competitiveness of MEON against state-of-the-art BIQA models using the group maximum differentiation competition methodology.

391 citations

Journal ArticleDOI
TL;DR: A deep bilinear model for blind image quality assessment that works for both synthetically and authentically distorted images and achieves state-of-the-art performance on both synthetic and authentic IQA databases is proposed.
Abstract: We propose a deep bilinear model for blind image quality assessment that works for both synthetically and authentically distorted images. Our model constitutes two streams of deep convolutional neural networks (CNNs), specializing in two distortion scenarios separately. For synthetic distortions, we first pre-train a CNN to classify the distortion type and the level of an input image, whose ground truth label is readily available at a large scale. For authentic distortions, we make use of a pre-train CNN (VGG-16) for the image classification task. The two feature sets are bilinearly pooled into one representation for a final quality prediction. We fine-tune the whole network on the target databases using a variant of stochastic gradient descent. The extensive experimental results show that the proposed model achieves state-of-the-art performance on both synthetic and authentic IQA databases. Furthermore, we verify the generalizability of our method on the large-scale Waterloo Exploration Database, and demonstrate its competitiveness using the group maximum differentiation competition methodology.

390 citations


Cited by
More filters
Christopher M. Bishop1
01 Jan 2006
TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Proceedings Article
06 Aug 2017
TL;DR: A variant of GANs employing label conditioning that results in 128 x 128 resolution image samples exhibiting global coherence is constructed and it is demonstrated that high resolution samples provide class information not present in low resolution samples.
Abstract: In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis. We construct a variant of GANs employing label conditioning that results in 128 x 128 resolution image samples exhibiting global coherence. We expand on previous work for image quality assessment to provide two new analyses for assessing the discriminability and diversity of samples from class-conditional image synthesis models. These analyses demonstrate that high resolution samples provide class information not present in low resolution samples. Across 1000 ImageNet classes, 128 x 128 samples are more than twice as discriminable as artificially resized 32 x 32 samples. In addition, 84.7% of the classes have samples exhibiting diversity comparable to real ImageNet data.

2,330 citations

Posted Content
TL;DR: In this article, a variant of GANs employing label conditioning was proposed to generate high resolution images. But the results showed that the high-resolution images were more than twice as discriminable as artificially resized 32x32 images.
Abstract: Synthesizing high resolution photorealistic images has been a long-standing challenge in machine learning. In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis. We construct a variant of GANs employing label conditioning that results in 128x128 resolution image samples exhibiting global coherence. We expand on previous work for image quality assessment to provide two new analyses for assessing the discriminability and diversity of samples from class-conditional image synthesis models. These analyses demonstrate that high resolution samples provide class information not present in low resolution samples. Across 1000 ImageNet classes, 128x128 samples are more than twice as discriminable as artificially resized 32x32 samples. In addition, 84.7% of the classes have samples exhibiting diversity comparable to real ImageNet data.

1,444 citations

Journal ArticleDOI
TL;DR: FFDNet as discussed by the authors proposes a fast and flexible denoising convolutional neural network with a tunable noise level map as the input, which can handle a wide range of noise levels effectively with a single network.
Abstract: Due to the fast inference and good performance, discriminative learning methods have been widely studied in image denoising. However, these methods mostly learn a specific model for each noise level, and require multiple models for denoising images with different noise levels. They also lack flexibility to deal with spatially variant noise, limiting their applications in practical denoising. To address these issues, we present a fast and flexible denoising convolutional neural network, namely FFDNet, with a tunable noise level map as the input. The proposed FFDNet works on downsampled sub-images, achieving a good trade-off between inference speed and denoising performance. In contrast to the existing discriminative denoisers, FFDNet enjoys several desirable properties, including: 1) the ability to handle a wide range of noise levels (i.e., [0, 75]) effectively with a single network; 2) the ability to remove spatially variant noise by specifying a non-uniform noise level map; and 3) faster speed than benchmark BM3D even on CPU without sacrificing denoising performance. Extensive experiments on synthetic and real noisy images are conducted to evaluate FFDNet in comparison with state-of-the-art denoisers. The results show that FFDNet is effective and efficient, making it highly attractive for practical denoising applications.

1,430 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: In this paper, a set of fast and effective CNN (convolutional neural network) denoisers and integrate them into model-based optimization method to solve other inverse problems (e.g., deblurring).
Abstract: Model-based optimization methods and discriminative learning methods have been the two dominant strategies for solving various inverse problems in low-level vision. Typically, those two kinds of methods have their respective merits and drawbacks, e.g., model-based optimization methods are flexible for handling different inverse problems but are usually time-consuming with sophisticated priors for the purpose of good performance, in the meanwhile, discriminative learning methods have fast testing speed but their application range is greatly restricted by the specialized task. Recent works have revealed that, with the aid of variable splitting techniques, denoiser prior can be plugged in as a modular part of model-based optimization methods to solve other inverse problems (e.g., deblurring). Such an integration induces considerable advantage when the denoiser is obtained via discriminative learning. However, the study of integration with fast discriminative denoiser prior is still lacking. To this end, this paper aims to train a set of fast and effective CNN (convolutional neural network) denoisers and integrate them into model-based optimization method to solve other inverse problems. Experimental results demonstrate that the learned set of denoisers can not only achieve promising Gaussian denoising results but also can be used as prior to deliver good performance for various low-level vision applications.

1,216 citations