
Showing papers on "Image quality" published in 2017


Proceedings Article
06 Aug 2017
TL;DR: A variant of GANs employing label conditioning that results in 128 x 128 resolution image samples exhibiting global coherence is constructed and it is demonstrated that high resolution samples provide class information not present in low resolution samples.
Abstract: In this paper we introduce new methods for the improved training of generative adversarial networks (GANs) for image synthesis. We construct a variant of GANs employing label conditioning that results in 128 x 128 resolution image samples exhibiting global coherence. We expand on previous work for image quality assessment to provide two new analyses for assessing the discriminability and diversity of samples from class-conditional image synthesis models. These analyses demonstrate that high resolution samples provide class information not present in low resolution samples. Across 1000 ImageNet classes, 128 x 128 samples are more than twice as discriminable as artificially resized 32 x 32 samples. In addition, 84.7% of the classes have samples exhibiting diversity comparable to real ImageNet data.

2,330 citations
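
A minimal sketch of the label-conditioning idea in PyTorch: embed the class label, fuse it with the noise vector, and upsample to a 128 x 128 RGB sample. The layer sizes, the multiplicative fusion, and the 110-dimensional latent are illustrative assumptions, not the authors' exact AC-GAN architecture.

```python
import torch
import torch.nn as nn

class ConditionalGenerator(nn.Module):
    def __init__(self, n_classes=1000, z_dim=110, base=64):
        super().__init__()
        self.embed = nn.Embedding(n_classes, z_dim)   # class label -> embedding
        self.net = nn.Sequential(
            # project to 4x4, then five stride-2 upsamplings: 4 -> 128
            nn.ConvTranspose2d(z_dim, base * 16, 4, 1, 0), nn.BatchNorm2d(base * 16), nn.ReLU(True),
            nn.ConvTranspose2d(base * 16, base * 8, 4, 2, 1), nn.BatchNorm2d(base * 8), nn.ReLU(True),
            nn.ConvTranspose2d(base * 8, base * 4, 4, 2, 1), nn.BatchNorm2d(base * 4), nn.ReLU(True),
            nn.ConvTranspose2d(base * 4, base * 2, 4, 2, 1), nn.BatchNorm2d(base * 2), nn.ReLU(True),
            nn.ConvTranspose2d(base * 2, base, 4, 2, 1), nn.BatchNorm2d(base), nn.ReLU(True),
            nn.ConvTranspose2d(base, 3, 4, 2, 1), nn.Tanh(),  # 128 x 128 RGB in [-1, 1]
        )

    def forward(self, z, labels):
        h = z * self.embed(labels)              # fuse noise with the class embedding
        return self.net(h[:, :, None, None])

z = torch.randn(8, 110)
labels = torch.randint(0, 1000, (8,))
samples = ConditionalGenerator()(z, labels)     # -> torch.Size([8, 3, 128, 128])
```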


Journal ArticleDOI
TL;DR: It is shown that the quality of the results improves significantly with better loss functions, even when the network architecture is left unchanged, and a novel, differentiable error function is proposed.
Abstract: Neural networks are becoming central in several areas of computer vision and image processing and different architectures have been proposed to solve specific problems. The impact of the loss layer of neural networks, however, has not received much attention in the context of image processing: the default and virtually only choice is ℓ2. In this paper, we bring attention to alternative choices for image restoration. In particular, we show the importance of perceptually-motivated losses when the resulting image is to be evaluated by a human observer. We compare the performance of several losses, and propose a novel, differentiable error function. We show that the quality of the results improves significantly with better loss functions, even when the network architecture is left unchanged.

1,758 citations
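
The paper's best-performing loss mixes MS-SSIM with ℓ1. A hedged single-scale sketch in PyTorch, substituting plain SSIM for MS-SSIM and assuming single-channel inputs in [0, 1]; the 0.84 weighting is illustrative:

```python
import torch
import torch.nn.functional as F

def gaussian_window(size=11, sigma=1.5):
    x = torch.arange(size, dtype=torch.float32) - size // 2
    g = torch.exp(-x ** 2 / (2 * sigma ** 2))
    g = g / g.sum()
    return (g[:, None] @ g[None, :]).view(1, 1, size, size)

def ssim(x, y, window, c1=0.01 ** 2, c2=0.03 ** 2):
    # x, y: (B, 1, H, W); local statistics via Gaussian filtering
    pad = window.shape[-1] // 2
    mu_x, mu_y = F.conv2d(x, window, padding=pad), F.conv2d(y, window, padding=pad)
    var_x = F.conv2d(x * x, window, padding=pad) - mu_x ** 2
    var_y = F.conv2d(y * y, window, padding=pad) - mu_y ** 2
    cov = F.conv2d(x * y, window, padding=pad) - mu_x * mu_y
    s = ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
        ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))
    return s.mean()

def mix_loss(pred, target, alpha=0.84):
    w = gaussian_window()
    return alpha * (1 - ssim(pred, target, w)) + (1 - alpha) * F.l1_loss(pred, target)
```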


Proceedings ArticleDOI
01 Oct 2017
TL;DR: In this article, a novel application of automated texture synthesis in combination with a perceptual loss focusing on creating realistic textures rather than optimizing for a pixel-accurate reproduction of ground truth images during training is proposed.
Abstract: Single image super-resolution is the task of inferring a high-resolution image from a single low-resolution input. Traditionally, the performance of algorithms for this task is measured using pixel-wise reconstruction measures such as peak signal-to-noise ratio (PSNR) which have been shown to correlate poorly with the human perception of image quality. As a result, algorithms minimizing these metrics tend to produce over-smoothed images that lack high-frequency textures and do not look natural despite yielding high PSNR values. We propose a novel application of automated texture synthesis in combination with a perceptual loss focusing on creating realistic textures rather than optimizing for a pixel-accurate reproduction of ground truth images during training. By using feed-forward fully convolutional neural networks in an adversarial training setting, we achieve a significant boost in image quality at high magnification ratios. Extensive experiments on a number of datasets show the effectiveness of our approach, yielding state-of-the-art results in both quantitative and qualitative benchmarks.

791 citations
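
A hedged sketch of a VGG-based perceptual loss of the kind this line of work builds on: distances are measured between deep feature maps rather than pixels. The cut point (roughly the relu4 block) and the MSE feature distance are illustrative choices, not the paper's exact configuration; downloading ImageNet weights and pre-normalized inputs are assumed.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19

class PerceptualLoss(nn.Module):
    def __init__(self, cut=21):  # truncate vgg19.features around the relu4 block
        super().__init__()
        feats = vgg19(weights="IMAGENET1K_V1").features[:cut].eval()
        for p in feats.parameters():
            p.requires_grad_(False)       # frozen, pretrained feature extractor
        self.feats = feats

    def forward(self, sr, hr):
        # distance in feature space instead of pixel space (inputs assumed
        # already normalized with ImageNet statistics)
        return nn.functional.mse_loss(self.feats(sr), self.feats(hr))

sr = torch.rand(1, 3, 128, 128)   # super-resolved estimate (placeholder)
hr = torch.rand(1, 3, 128, 128)   # ground-truth high-resolution image
loss = PerceptualLoss()(sr, hr)
```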


Journal ArticleDOI
TL;DR: A deep convolutional neural network is used here to map low-dose CT images toward their corresponding normal-dose counterparts in a patch-by-patch fashion, demonstrating the great potential of the proposed method for artifact reduction and structure preservation.
Abstract: In order to reduce the potential radiation risk, low-dose CT has attracted increasing attention. However, simply lowering the radiation dose will significantly degrade the image quality. In this paper, we propose a new noise reduction method for low-dose CT via deep learning without accessing original projection data. A deep convolutional neural network is used here to map low-dose CT images toward their corresponding normal-dose counterparts in a patch-by-patch fashion. Qualitative results demonstrate the great potential of the proposed method for artifact reduction and structure preservation. In terms of quantitative metrics, the proposed method shows substantial improvement in PSNR, RMSE, and SSIM over the competing state-of-the-art methods. Furthermore, our method is an order of magnitude faster than the iterative reconstruction and patch-based image denoising methods.

603 citations
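
A hedged sketch of the patch-by-patch mapping: a small CNN takes noisy low-dose patches and predicts their normal-dose counterparts. Depth, widths, the 55x55 patch size, and the residual connection are illustrative assumptions; random tensors stand in for real CT data.

```python
import torch
import torch.nn as nn

class DenoiseCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(True),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(True),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, x):
        # predict a correction and add it back, mapping low-dose toward
        # normal-dose appearance (the residual connection is our assumption)
        return x + self.body(x)

model = DenoiseCNN()
low_dose = torch.rand(16, 1, 55, 55)     # batch of low-dose patches (placeholder)
normal_dose = torch.rand(16, 1, 55, 55)  # matching normal-dose patches (placeholder)
loss = nn.functional.mse_loss(model(low_dose), normal_dose)
```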


Journal ArticleDOI
25 Sep 2017-PLOS ONE
TL;DR: The MRI Quality Control tool (MRIQC), a tool for extracting quality measures and fitting a binary (accept/exclude) classifier, is introduced; it performs with high accuracy in intra-site prediction, but performance on unseen sites leaves room for improvement.
Abstract: Quality control of MRI is essential for excluding problematic acquisitions and avoiding bias in subsequent image processing and analysis. Visual inspection is subjective and impractical for large-scale datasets. Although automated quality assessments have been demonstrated on single-site datasets, it is unclear whether solutions can generalize to unseen data acquired at new sites. Here, we introduce the MRI Quality Control tool (MRIQC), a tool for extracting quality measures and fitting a binary (accept/exclude) classifier. Our tool can be run both locally and as a free online service via the OpenNeuro.org portal. The classifier is trained on a publicly available, multi-site dataset (17 sites, N = 1102). We perform model selection by evaluating different normalization and feature exclusion approaches aimed at maximizing across-site generalization, and estimate an accuracy of 76%±13% on new sites using leave-one-site-out cross-validation. We confirm that result on a held-out dataset (2 sites, N = 265), also obtaining 76% accuracy. Even though the performance of the trained classifier is statistically above chance, we show that it is susceptible to site effects and unable to account for artifacts specific to new sites. MRIQC performs with high accuracy in intra-site prediction, but performance on unseen sites leaves room for improvement, which might require more labeled data and new approaches to between-site variability. Overcoming these limitations is crucial for more objective quality assessment of neuroimaging data, and for enabling the analysis of extremely large and multi-site samples.

492 citations
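
The abstract's leave-one-site-out evaluation maps directly onto scikit-learn's LeaveOneGroupOut splitter with site labels as the groups. A hedged sketch; random features and labels stand in for the MRIQC quality measures, and the classifier choice is an assumption:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(1102, 64))         # quality measures per scan (placeholder)
y = rng.integers(0, 2, size=1102)       # accept (0) / exclude (1) labels (placeholder)
sites = rng.integers(0, 17, size=1102)  # acquisition site of each scan

scores = cross_val_score(
    RandomForestClassifier(n_estimators=100, random_state=0),
    X, y, groups=sites, cv=LeaveOneGroupOut(),
)
print(f"accuracy on held-out sites: {scores.mean():.2f} +/- {scores.std():.2f}")
```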


Proceedings ArticleDOI
Andrey Ignatov, Nikolay Kobyshev, Radu Timofte, Kenneth Vanhoey, Luc Van Gool
01 Oct 2017
TL;DR: An end-to-end deep learning approach that bridges the gap by translating ordinary photos into DSLR-quality images by learning the translation function using a residual convolutional neural network that improves both color rendition and image sharpness.
Abstract: Despite a rapid rise in the quality of built-in smartphone cameras, their physical limitations – small sensor size, compact lenses, and the lack of specific hardware – prevent them from achieving the quality results of DSLR cameras. In this work we present an end-to-end deep learning approach that bridges this gap by translating ordinary photos into DSLR-quality images. We propose learning the translation function using a residual convolutional neural network that improves both color rendition and image sharpness. Since the standard mean squared loss is not well suited for measuring perceptual image quality, we introduce a composite perceptual error function that combines content, color and texture losses. The first two losses are defined analytically, while the texture loss is learned in an adversarial fashion. We also present DPED, a large-scale dataset that consists of real photos captured from three different phones and one high-end reflex camera. Our quantitative and qualitative assessments reveal that the enhanced image quality is comparable to that of DSLR-taken photos, while the methodology is generalized to any type of digital camera.

423 citations


Posted Content
TL;DR: To allow fast and high‐quality reconstruction of clinical accelerated multi‐coil MR data by learning a variational network that combines the mathematical structure of variational models with deep learning.
Abstract: Purpose: To allow fast and high-quality reconstruction of clinical accelerated multi-coil MR data by learning a variational network that combines the mathematical structure of variational models with deep learning. Theory and Methods: Generalized compressed sensing reconstruction formulated as a variational model is embedded in an unrolled gradient descent scheme. All parameters of this formulation, including the prior model defined by filter kernels and activation functions as well as the data term weights, are learned during an offline training procedure. The learned model can then be applied online to previously unseen data. Results: The variational network approach is evaluated on a clinical knee imaging protocol. The variational network reconstructions outperform standard reconstruction algorithms in terms of image quality and residual artifacts for all tested acceleration factors and sampling patterns. Conclusion: Variational network reconstructions preserve the natural appearance of MR images as well as pathologies that were not included in the training data set. Due to its high computational performance, i.e., reconstruction time of 193 ms on a single graphics card, and the omission of parameter tuning once the network is trained, this new approach to image reconstruction can easily be integrated into clinical workflow.

366 citations
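
A hedged sketch of one stage of an unrolled gradient scheme for undersampled MRI: a data-consistency gradient step through the masked Fourier forward model plus a learned regularizer. The real variational network parameterizes the prior with learned filter kernels and activation functions; here a tiny CNN stands in, and the two-channel real/imaginary encoding is an implementation assumption.

```python
import torch
import torch.nn as nn

class UnrolledStage(nn.Module):
    def __init__(self):
        super().__init__()
        self.prior = nn.Sequential(   # stand-in for the learned regularizer
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(True),
            nn.Conv2d(32, 2, 3, padding=1),
        )
        self.step = nn.Parameter(torch.tensor(0.1))  # learned data-term weight

    def forward(self, x, y, mask):
        # x: (B, 2, H, W) image with real/imag channels; y: masked k-space
        xc = torch.complex(x[:, 0], x[:, 1])
        grad = torch.fft.ifft2((torch.fft.fft2(xc) * mask - y) * mask)  # A^H(Ax - y)
        grad = torch.stack([grad.real, grad.imag], dim=1)
        return x - self.step * grad - self.prior(x)  # one unrolled gradient step

stages = nn.ModuleList(UnrolledStage() for _ in range(8))  # fixed unrolling depth
mask = (torch.rand(64, 64) < 0.3).float()                  # sampling pattern
y = torch.complex(torch.randn(2, 64, 64), torch.randn(2, 64, 64)) * mask
x = torch.zeros(2, 2, 64, 64)
for stage in stages:                                       # reconstruct by unrolling
    x = stage(x, y, mask)
```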


Journal ArticleDOI
TL;DR: A blind image evaluator based on a convolutional neural network (BIECON) is proposed that follows the FR-IQA behavior using the local quality maps as intermediate targets for conventional neural networks, which leads to NR-IQA prediction accuracy that is comparable with that of state-of-the-art FR-IQA methods.
Abstract: In general, owing to the benefits obtained from original information, full-reference image quality assessment (FR-IQA) achieves relatively higher prediction accuracy than no-reference image quality assessment (NR-IQA). By fully utilizing reference images, conventional FR-IQA methods have been investigated to produce objective scores that are close to subjective scores. In contrast, NR-IQA does not consider reference images; thus, its performance is inferior to that of FR-IQA. To alleviate this accuracy discrepancy between FR-IQA and NR-IQA methods, we propose a blind image evaluator based on a convolutional neural network (BIECON). To imitate FR-IQA behavior, we adopt the strong representation power of a deep convolutional neural network to generate a local quality map, similar to FR-IQA. To obtain the best results from the deep neural network, replacing hand-crafted features with automatically learned features is necessary. To apply the deep model to the NR-IQA framework, three critical problems must be resolved: 1) lack of training data; 2) absence of local ground truth targets; and 3) different purposes of feature learning. BIECON follows the FR-IQA behavior using the local quality maps as intermediate targets for conventional neural networks, which leads to NR-IQA prediction accuracy that is comparable with that of state-of-the-art FR-IQA methods.

364 citations


Proceedings ArticleDOI
26 Jul 2017
TL;DR: This work proposes a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA), and demonstrates how this approach can be made significantly more efficient than traditional Siamese Networks by forward propagating a batch of images through a single network and backpropagating gradients derived from all pairs of images in the batch.
Abstract: We propose a no-reference image quality assessment (NR-IQA) approach that learns from rankings (RankIQA). To address the problem of limited IQA dataset size, we train a Siamese Network to rank images in terms of image quality by using synthetically generated distortions for which relative image quality is known. These ranked image sets can be automatically generated without laborious human labeling. We then use fine-tuning to transfer the knowledge represented in the trained Siamese Network to a traditional CNN that estimates absolute image quality from single images. We demonstrate how our approach can be made significantly more efficient than traditional Siamese Networks by forward propagating a batch of images through a single network and backpropagating gradients derived from all pairs of images in the batch. Experiments on the TID2013 benchmark show that we improve the state-of-the-art by over 5%. Furthermore, on the LIVE benchmark we show that our approach is superior to existing NR-IQA techniques and that we even outperform state-of-the-art full-reference IQA (FR-IQA) methods without having to resort to high-quality reference images to infer IQA.

316 citations
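
The efficiency trick described above can be sketched directly: score one batch with a single network and form the ranking loss over all ordered pairs at once. The hinge margin and the synthetic distortion levels below are illustrative assumptions.

```python
import torch

def batch_rank_loss(scores, levels, margin=1.0):
    """scores: (B,) predicted quality from one forward pass of a single
    network; levels: (B,) synthetic distortion level, lower = higher quality."""
    diff = scores[:, None] - scores[None, :]          # all pairwise score gaps
    should_win = levels[:, None] < levels[None, :]    # i less distorted than j
    hinge = torch.clamp(margin - diff, min=0)
    return hinge[should_win].mean()                   # assumes mixed levels in batch

scores = torch.randn(16, requires_grad=True)          # stand-in network outputs
levels = torch.randint(0, 5, (16,))
batch_rank_loss(scores, levels).backward()            # one backward pass covers all pairs
```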


Journal ArticleDOI
TL;DR: It is demonstrated that the proposed novel deep learning-based generative adversarial model, RefineGAN, outperforms the state-of-the-art CS-MRI methods by a large margin in terms of both running time and image quality via evaluation using several open-source MRI databases.
Abstract: Compressed Sensing MRI (CS-MRI) has provided theoretical foundations upon which the time-consuming MRI acquisition process can be accelerated. However, it primarily relies on iterative numerical solvers, which hinders its adoption in time-critical applications. In addition, recent advances in deep neural networks have shown their potential in computer vision and image processing, but their application to MRI reconstruction is still at an early stage. In this paper, we propose a novel deep learning-based generative adversarial model, RefineGAN, for fast and accurate CS-MRI reconstruction. The proposed model is a variant of fully-residual convolutional autoencoder and generative adversarial networks (GANs), specifically designed for the CS-MRI formulation; it employs deeper generator and discriminator networks with a cyclic data consistency loss for faithful interpolation of the given under-sampled k-space data. In addition, our solution leverages a chained network to further enhance the reconstruction quality. RefineGAN is fast and accurate -- the reconstruction process is extremely rapid, as low as tens of milliseconds for reconstruction of a 256x256 image, because it is a one-way deployment on a feed-forward network, and the image quality is superior even for an extremely low sampling rate (as low as 10%) due to the data-driven nature of the method. We demonstrate that RefineGAN outperforms the state-of-the-art CS-MRI methods by a large margin in terms of both running time and image quality via evaluation using several open-source MRI databases.

287 citations
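
A hedged numpy sketch of the k-space data-consistency idea that underlies losses of this kind: wherever a sample was actually measured, the measurement overrides the network's estimate. The 10% random mask is only an illustration of the sampling rates mentioned in the abstract.

```python
import numpy as np

def data_consistency(recon, measured_k, mask):
    """recon: complex image estimate; measured_k: undersampled k-space;
    mask: boolean sampling pattern (True where a sample was acquired)."""
    k = np.fft.fft2(recon)
    k[mask] = measured_k[mask]       # trust measurements over the network output
    return np.fft.ifft2(k)

rng = np.random.default_rng(0)
image = rng.normal(size=(256, 256)) + 0j
mask = rng.random((256, 256)) < 0.10       # ~10% sampling rate, as in the abstract
measured = np.fft.fft2(image) * mask
zero_filled = np.fft.ifft2(measured)       # naive zero-filled reconstruction
refined = data_consistency(zero_filled, measured, mask)
```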


Journal ArticleDOI
TL;DR: The proposed blind IQA method generates an overall quality estimation of a contrast-distorted image by properly combining local and global considerations and demonstrates the superiority of the training-free blind technique over state-of-the-art full- and no-reference IQA methods.
Abstract: The general purpose of viewing a picture is to gain as much information as possible. With this in mind, we devise a new no-reference/blind metric for image quality assessment (IQA) of contrast distortion. For local details, we first roughly remove predictable regions in an image, since the unpredicted remainder carries the most information. We then compute the entropy of particular unpredicted areas of maximum information via visual saliency. From a global perspective, we compare the image histogram with the uniformly distributed histogram of maximum information via the symmetric Kullback–Leibler divergence. The proposed blind IQA method generates an overall quality estimate of a contrast-distorted image by properly combining local and global considerations. Thorough experiments on five databases/subsets demonstrate the superiority of our training-free blind technique over state-of-the-art full- and no-reference IQA methods. Furthermore, the proposed model is also applied to improve the performance of general-purpose blind quality metrics by a sizable margin.
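
The global term is easy to make concrete: a hedged numpy sketch of the symmetric Kullback–Leibler divergence between an 8-bit image's gray-level histogram and the uniform, maximum-entropy histogram. Bin count and smoothing constant are illustrative choices.

```python
import numpy as np

def symmetric_kl_to_uniform(img, bins=256, eps=1e-12):
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum() + eps         # empirical gray-level distribution
    u = np.full(bins, 1.0 / bins)       # uniform histogram of maximum information
    return float(np.sum(p * np.log(p / u)) + np.sum(u * np.log(u / p)))

img = np.random.default_rng(0).integers(0, 256, size=(128, 128))
print(symmetric_kl_to_uniform(img))     # near 0: histogram close to uniform
```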

Journal ArticleDOI
TL;DR: In this article, the authors proposed an efficient method to produce an image that is significantly sharper than the input blurry one, without introducing artifacts, such as halos and noise amplification, which can be used as a preprocessing step to induce the learning of more effective upscaling filters with built-in sharpening and contrast enhancement.
Abstract: Given an image, we wish to produce an image of larger size with significantly more pixels and higher image quality. This is generally known as the single image super-resolution problem. The idea is that with sufficient training data (corresponding pairs of low and high resolution images) we can learn a set of filters (i.e., a mapping) that, when applied to a given image that is not in the training set, will produce a higher resolution version of it, where the learning is preferably low complexity. In our proposed approach, the run-time is more than one to two orders of magnitude faster than the best competing methods currently available, while producing results comparable or better than state-of-the-art. A closely related topic is image sharpening and contrast enhancement, i.e., improving the visual quality of a blurry image by amplifying the underlying details (a wide range of frequencies). Our approach additionally includes an extremely efficient way to produce an image that is significantly sharper than the input blurry one, without introducing artifacts, such as halos and noise amplification. We illustrate how this effective sharpening algorithm, in addition to being of independent interest, can be used as a preprocessing step to induce the learning of more effective upscaling filters with a built-in sharpening and contrast enhancement effect.

Journal ArticleDOI
TL;DR: A new perceptual image quality assessment (IQA) metric based on the human visual system (HVS) is proposed that performs efficiently with convolution operations at multiscales, gradient magnitude, and color information similarity, and a perceptual-based pooling.
Abstract: A fast reliable computational quality predictor is eagerly desired in practical image/video applications, such as serving for the quality monitoring of real-time coding and transcoding. In this paper, we propose a new perceptual image quality assessment (IQA) metric based on the human visual system (HVS). The proposed IQA model performs efficiently with convolution operations at multiscales, gradient magnitude, and color information similarity, and a perceptual-based pooling. Extensive experiments are conducted using four popular large-size image databases and two multiply distorted image databases, and results validate the superiority of our approach over modern IQA measures in efficiency and efficacy. Our metric is built on the theoretical support of the HVS with lately designed IQA methods as special cases.
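
One named ingredient, gradient magnitude similarity, can be sketched in a few lines of numpy/scipy. The Sobel operator and the stability constant are illustrative choices; the full metric adds multiscale convolution, color similarity, and perceptual pooling.

```python
import numpy as np
from scipy.ndimage import sobel

def gradient_magnitude(img):
    return np.hypot(sobel(img, axis=0), sobel(img, axis=1))

def gms_map(ref, dist, c=170.0):        # c stabilizes the ratio (illustrative)
    g_r, g_d = gradient_magnitude(ref), gradient_magnitude(dist)
    return (2 * g_r * g_d + c) / (g_r ** 2 + g_d ** 2 + c)  # 1 where gradients agree

rng = np.random.default_rng(0)
ref = rng.random((64, 64)) * 255
dist = ref + rng.normal(scale=10, size=ref.shape)            # distorted copy
print(gms_map(ref, dist).mean())        # average-pooled similarity in (0, 1]
```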

Proceedings ArticleDOI
21 Jul 2017
TL;DR: A novel convolutional neural networks (CNN) based FR-IQA model, named Deep Image Quality Assessment (DeepQA), where the behavior of the HVS is learned from the underlying data distribution of IQA databases, which achieves the state-of-the-art prediction accuracy among FR- IQA models.
Abstract: Since human observers are the ultimate receivers of digital images, image quality metrics should be designed from a human-oriented perspective. Conventionally, a number of full-reference image quality assessment (FR-IQA) methods adopted various computational models of the human visual system (HVS) from psychological vision science research. In this paper, we propose a novel convolutional neural networks (CNN) based FR-IQA model, named Deep Image Quality Assessment (DeepQA), where the behavior of the HVS is learned from the underlying data distribution of IQA databases. Different from previous studies, our model seeks the optimal visual weight based on understanding of database information itself without any prior knowledge of the HVS. Through the experiments, we show that the predicted visual sensitivity maps agree with the human subjective opinions. In addition, DeepQA achieves the state-of-the-art prediction accuracy among FR-IQA models.

Journal ArticleDOI
TL;DR: A novel blind/no-reference (NR) model for assessing the perceptual quality of screen content pictures is developed with big data learning; it delivers computational efficiency and promising performance.
Abstract: Recent years have witnessed a growing number of image- and video-centric applications on mobile, vehicular, and cloud platforms, involving a wide variety of digital screen content images. Unlike natural scene images captured with modern high-fidelity cameras, screen content images are typically composed of fewer colors, simpler shapes, and a larger frequency of thin lines. In this paper, we develop a novel blind/no-reference (NR) model for assessing the perceptual quality of screen content pictures with big data learning. The new model extracts four types of features descriptive of the picture complexity, of screen content statistics, of global brightness quality, and of the sharpness of details. Comparative experiments verify the efficacy of the new model as compared with existing relevant blind picture quality assessment algorithms applied on screen content image databases. A regression module is trained on a considerable number of training samples labeled with objective visual quality predictions delivered by a high-performance full-reference method designed for screen content image quality assessment (IQA). This results in an opinion-unaware NR blind screen content IQA algorithm. Our proposed model delivers computational efficiency and promising performance. The source code of the new model will be available at: https://sites.google.com/site/guke198701/publications

Journal ArticleDOI
TL;DR: In this article, a bag-of-features approach is proposed to capture consistencies or departures therefrom of the statistics of real-world images in different color spaces and transform domains.
Abstract: Current top-performing blind perceptual image quality prediction models are generally trained on legacy databases of human quality opinion scores on synthetically distorted images. Therefore, they learn image features that effectively predict human visual quality judgments of inauthentic and usually isolated (single) distortions. However, real-world images usually contain complex composite mixtures of multiple distortions. We study the perceptually relevant natural scene statistics of such authentically distorted images in different color spaces and transform domains. We propose a "bag of feature maps" approach that avoids assumptions about the type of distortion(s) contained in an image and instead focuses on capturing consistencies-or departures therefrom-of the statistics of real-world images. Using a large database of authentically distorted images, human opinions of them, and bags of features computed on them, we train a regressor to conduct image quality prediction. We demonstrate the competence of the features toward improving automatic perceptual quality prediction by testing a learned algorithm using them on a benchmark legacy database as well as on a newly introduced distortion-realistic resource called the LIVE In the Wild Image Quality Challenge Database. We extensively evaluate the perceptual quality prediction model and algorithm and show that it is able to achieve good-quality prediction power that is better than other leading models.

Posted ContentDOI
22 Aug 2017-bioRxiv
TL;DR: The MRI Quality Control tool (MRIQC), a tool for extracting quality measures and fitting a binary (accept/exclude) classifier, is introduced; it performs with high accuracy in intra-site prediction, but performance on unseen sites leaves room for improvement.
Abstract: Quality control of MRI is essential for excluding problematic acquisitions and avoiding bias in subsequent image processing and analysis. Visual inspection is subjective and impractical for large-scale datasets. Although automated quality assessments have been demonstrated on single-site datasets, it is unclear whether solutions can generalize to unseen data acquired at new sites. Here, we introduce the MRI Quality Control tool (MRIQC), a tool for extracting quality measures and fitting a binary (accept/exclude) classifier. Our tool can be run both locally and as a free online service via the OpenNeuro.org portal. The classifier is trained on a publicly available, multi-site dataset (17 sites, N=1102). We perform model selection by evaluating different normalization and feature exclusion approaches aimed at maximizing across-site generalization, and estimate an accuracy of 76%±13% on new sites using leave-one-site-out cross-validation. We confirm that result on a held-out dataset (2 sites, N=265), also obtaining 76% accuracy. Even though the performance of the trained classifier is statistically above chance, we show that it is susceptible to site effects and unable to account for artifacts specific to new sites. MRIQC performs with high accuracy in intra-site prediction, but performance on unseen sites leaves room for improvement, which might require more labeled data and new approaches to between-site variability. Overcoming these limitations is crucial for more objective quality assessment of neuroimaging data, and for enabling the analysis of extremely large and multi-site samples.

Journal ArticleDOI
TL;DR: A new method of separable data hiding in encrypted images is proposed using CS and the discrete Fourier transform, which takes full advantage of both real and imaginary coefficients to ensure good recovery and provide a flexible payload.
Abstract: Reversible data hiding in encrypted images has become an effective and popular way to preserve the security and privacy of users’ personal images. Recently, Xiao et al. first presented reversible data hiding in encrypted images using the modern signal processing technique of compressive sensing (CS). However, the quality of the decrypted image is not high enough. In this paper, a new method of separable data hiding in encrypted images is proposed using CS and the discrete Fourier transform, which takes full advantage of both real and imaginary coefficients to ensure good recovery and provide a flexible payload. Compared with the original work, the proposed method obtains better image quality when concealing the same embedding capacity. Furthermore, image decryption and data extraction are separable in the proposed method, and the secret data can be extracted relatively accurately.

Posted Content
TL;DR: In this article, the authors proposed the deep Laplacian pyramid super-resolution network (LapSRN), which progressively reconstructs the sub-band residuals of high-resolution images at multiple pyramid levels.
Abstract: Convolutional neural networks have recently demonstrated high-quality reconstruction for single image super-resolution. However, existing methods often require a large number of network parameters and entail heavy computational loads at runtime for generating high-accuracy super-resolution results. In this paper, we propose the deep Laplacian Pyramid Super-Resolution Network for fast and accurate image super-resolution. The proposed network progressively reconstructs the sub-band residuals of high-resolution images at multiple pyramid levels. In contrast to existing methods that involve the bicubic interpolation for pre-processing (which results in large feature maps), the proposed method directly extracts features from the low-resolution input space and thereby entails low computational loads. We train the proposed network with deep supervision using the robust Charbonnier loss functions and achieve high-quality image reconstruction. Furthermore, we utilize the recursive layers to share parameters across as well as within pyramid levels, and thus drastically reduce the number of parameters. Extensive quantitative and qualitative evaluations on benchmark datasets show that the proposed algorithm performs favorably against the state-of-the-art methods in terms of run-time and image quality.
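
The Charbonnier loss named in the abstract is a one-liner: a smooth, differentiable relaxation of ℓ1 that stays robust to outliers. A sketch in PyTorch; the eps value is a conventional choice, not necessarily the authors':

```python
import torch

def charbonnier_loss(pred, target, eps=1e-3):
    # sqrt((x - y)^2 + eps^2): behaves like L1 away from zero, but is smooth
    return torch.mean(torch.sqrt((pred - target) ** 2 + eps ** 2))

pred = torch.rand(4, 3, 64, 64, requires_grad=True)
target = torch.rand(4, 3, 64, 64)
charbonnier_loss(pred, target).backward()   # well-defined gradient even at zero error
```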

Journal ArticleDOI
TL;DR: In this article, the authors compare the performance of Hadamard single-pixel imaging (HSI) and Fourier single-pixel imaging (FSI) with theoretical analysis and experiments, and show that FSI is more efficient than HSI, while HSI is more noise-robust than FSI.
Abstract: Single-pixel imaging, which employs active illumination to acquire spatial information, is an innovative imaging scheme and has received increasing attention in recent years. It is applicable to imaging at non-visible wavelengths and imaging under low light conditions. However, single-pixel imaging has historically suffered from low reconstruction quality and long data-acquisition times. Hadamard single-pixel imaging (HSI) and Fourier single-pixel imaging (FSI) are two representative deterministic-model-based techniques. Both techniques are able to achieve high-quality and efficient imaging, remarkably improving the applicability of the single-pixel imaging scheme. In this paper, we compare the performances of HSI and FSI with theoretical analysis and experiments. The results show that FSI is more efficient than HSI, while HSI is more noise-robust than FSI. Our work may provide a guideline for researchers to choose the suitable single-pixel imaging technique for their applications.
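
A hedged numpy sketch of the Hadamard measurement model behind HSI: project the scene onto Hadamard patterns, record one detector value per pattern, and invert with the orthogonal Hadamard transform. Real systems measure the ±1 patterns differentially; that detail is omitted here.

```python
import numpy as np
from scipy.linalg import hadamard

n = 32                                      # n x n scene; n * n a power of two
H = hadamard(n * n)                         # each row is one +/-1 illumination pattern
scene = np.random.default_rng(0).random((n, n))
measurements = H @ scene.ravel()            # one single-pixel reading per pattern
recovered = (H.T @ measurements) / (n * n)  # H is orthogonal up to a factor of n*n
assert np.allclose(recovered.reshape(n, n), scene)
```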

Proceedings ArticleDOI
01 Jul 2017
TL;DR: A novel multispectral Region Proposal Network (RPN) built upon the pre-trained very deep convolutional network VGG-16 is presented; its proposals are evaluated using a Boosted Decision Trees classifier in order to reduce potential false positive detections.
Abstract: Multispectral images that combine visual-optical (VIS) and infrared (IR) image information are a promising source of data for automatic person detection. Especially in automotive or surveillance applications, challenging conditions such as insufficient illumination or large distances between camera and object occur regularly and can affect image quality. This leads to weak image contrast or low object resolution. In order to detect persons under such conditions, we apply deep learning for effectively fusing the VIS and IR information in multispectral images. We present a novel multispectral Region Proposal Network (RPN) that is built upon the pre-trained very deep convolutional network VGG-16. The proposals of this network are further evaluated using a Boosted Decision Trees classifier in order to reduce potential false positive detections. With a log-average miss rate of 29.83% on the reasonable test set of the KAIST Multispectral Pedestrian Detection Benchmark, we improve the current state-of-the-art by about 18%.

Proceedings ArticleDOI
Xueyang Fu, Zhiwen Fan, Mei Ling, Yue Huang, Xinghao Ding
01 Nov 2017
TL;DR: A novel optimal contrast improvement method, which is efficient and reduces artifacts, is proposed to address low contrast; it is straightforward to implement and appropriate for real-time application.
Abstract: Underwater images often suffer from color shift and contrast degradation due to the absorption and scattering of light while traveling in water. In order to handle these issues, we present and solve two sub-problems to improve underwater image quality. First, we introduce an effective color correcting strategy based on piece-wise linear transformation to address the color distortion. Then we discuss a novel optimal contrast improvement method, which is efficient and reduces artifacts, to address the low contrast. Since most operations are pixel-wise calculations, the proposed method is straightforward to implement and appropriate for real-time applications. In addition, prior knowledge about imaging conditions is not required. Experiments show improvements in color, contrast, naturalness, and object prominence in the enhanced images.
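
A hedged numpy sketch of a per-channel piecewise linear stretch of the kind the color-correction step describes: clip each channel at low/high percentiles and rescale to the full range. The percentile cut-offs are illustrative assumptions, not the authors' exact transformation.

```python
import numpy as np

def stretch_channels(img, lo_pct=1, hi_pct=99):
    out = np.empty_like(img, dtype=np.float64)
    for c in range(img.shape[2]):           # pixel-wise, one channel at a time
        lo, hi = np.percentile(img[..., c], [lo_pct, hi_pct])
        out[..., c] = np.clip((img[..., c] - lo) / (hi - lo + 1e-12), 0.0, 1.0)
    return out

underwater = np.random.default_rng(0).random((120, 160, 3)) * 0.5  # dim, low contrast
corrected = stretch_channels(underwater)    # each channel now spans [0, 1]
```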

Journal ArticleDOI
TL;DR: A unified content-type adaptive (UCA) blind image quality assessment model that is applicable across content types and leads to superior performance on the constructed CCT database, and is training-free, implying strong generalizability.
Abstract: Digital images in the real world are created by a variety of means and have diverse properties. A photographical natural scene image (NSI) may exhibit substantially different characteristics from a computer graphic image (CGI) or a screen content image (SCI). This poses major challenges for objective image quality assessment, for which existing approaches lack effective mechanisms to capture such content type variations, and thus are difficult to generalize from one type to another. To tackle this problem, we first construct a cross-content-type (CCT) database, which contains 1,320 distorted NSIs, CGIs, and SCIs, compressed using the high efficiency video coding (HEVC) intra coding method and the screen content compression (SCC) extension of HEVC. We then carry out a subjective experiment on the database in a well-controlled laboratory environment. Moreover, we propose a unified content-type adaptive (UCA) blind image quality assessment model that is applicable across content types. A key step in UCA is to incorporate the variations of human perceptual characteristics in viewing different content types through a multi-scale weighting framework. This leads to superior performance on the constructed CCT database. UCA is training-free, implying strong generalizability. To verify this, we test UCA on other databases containing JPEG, MPEG-2, H.264, and HEVC compressed images/videos, and observe that it consistently achieves competitive performance.

Journal ArticleDOI
TL;DR: This paper shows that a vast amount of reliable training data in the form of quality-discriminable image pairs (DIPs) can be obtained automatically at low cost by exploiting large-scale databases with diverse image content, and learns an opinion-unaware BIQA (OU-BIQA, meaning that no subjective opinions are used for training) model from millions of DIPs, leading to a DIP inferred quality (dipIQ) index.
Abstract: Objective assessment of image quality is fundamentally important in many image processing tasks. In this paper, we focus on learning blind image quality assessment (BIQA) models, which predict the quality of a digital image with no access to its original pristine-quality counterpart as reference. One of the biggest challenges in learning BIQA models is the conflict between the gigantic image space (which is in the dimension of the number of image pixels) and the extremely limited reliable ground truth data for training. Such data are typically collected via subjective testing, which is cumbersome, slow, and expensive. Here, we first show that a vast amount of reliable training data in the form of quality-discriminable image pairs (DIPs) can be obtained automatically at low cost by exploiting large-scale databases with diverse image content. We then learn an opinion-unaware BIQA (OU-BIQA, meaning that no subjective opinions are used for training) model using RankNet, a pairwise learning-to-rank (L2R) algorithm, from millions of DIPs, each associated with a perceptual uncertainty level, leading to a DIP inferred quality (dipIQ) index. Extensive experiments on four benchmark IQA databases demonstrate that dipIQ outperforms the state-of-the-art OU-BIQA models. The robustness of dipIQ is also significantly improved as confirmed by the group MAximum Differentiation competition method. Furthermore, we extend the proposed framework by learning models with ListNet (a listwise L2R algorithm) on quality-discriminable image lists (DIL). The resulting DIL inferred quality index achieves an additional performance gain.
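
The RankNet objective over quality-discriminable image pairs reduces to a logistic loss on score differences. A hedged PyTorch sketch; the per-pair perceptual uncertainty weighting described in the abstract is omitted:

```python
import torch
import torch.nn.functional as F

def ranknet_loss(s_hq, s_lq):
    """s_hq, s_lq: predicted scores for the known higher/lower quality image of
    each pair; logistic loss pushes sigmoid(s_hq - s_lq) toward 1."""
    return F.binary_cross_entropy_with_logits(s_hq - s_lq, torch.ones_like(s_hq))

s_hq = torch.randn(32, requires_grad=True)  # stand-in network scores
s_lq = torch.randn(32, requires_grad=True)
ranknet_loss(s_hq, s_lq).backward()
```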

Journal ArticleDOI
Lingyun Wu, Jie-Zhi Cheng, Shengli Li, Baiying Lei, Tianfu Wang, Dong Ni
TL;DR: It will be illustrated that the computerized assessment with the FUIQA scheme can be comparable to the subjective ratings from medical doctors.
Abstract: The quality of ultrasound (US) images for the obstetric examination is crucial for accurate biometric measurement. However, manual quality control is a labor-intensive process and often impractical in a clinical setting. To improve the efficiency of examination and alleviate the measurement error caused by improper US scanning operation and slice selection, a computerized fetal US image quality assessment (FUIQA) scheme is proposed to assist the implementation of US image quality control in the clinical obstetric examination. The proposed FUIQA is realized with two deep convolutional neural network models, which are denoted as L-CNN and C-CNN, respectively. The L-CNN aims to find the region of interest (ROI) of the fetal abdominal region in the US image. Based on the ROI found by the L-CNN, the C-CNN evaluates the image quality by assessing the goodness of depiction for the key structures of stomach bubble and umbilical vein. To further boost the performance of the L-CNN, we augment the input sources of the neural network with the local phase features along with the original US data. It will be shown that the heterogeneous input sources help to improve the performance of the L-CNN. The performance of the proposed FUIQA is compared with the subjective image quality evaluation results from three medical doctors. With comprehensive experiments, it will be illustrated that the computerized assessment with our FUIQA scheme can be comparable to the subjective ratings from medical doctors.

Posted Content
Andrey Ignatov, Nikolay Kobyshev, Radu Timofte, Kenneth Vanhoey, Luc Van Gool
TL;DR: In this article, a residual convolutional neural network was proposed to translate ordinary photos into DSLR-quality images by combining content, color, and texture losses, where the first two losses are defined analytically, while the texture loss is learned in an adversarial fashion.
Abstract: Despite a rapid rise in the quality of built-in smartphone cameras, their physical limitations - small sensor size, compact lenses, and the lack of specific hardware - prevent them from achieving the quality results of DSLR cameras. In this work we present an end-to-end deep learning approach that bridges this gap by translating ordinary photos into DSLR-quality images. We propose learning the translation function using a residual convolutional neural network that improves both color rendition and image sharpness. Since the standard mean squared loss is not well suited for measuring perceptual image quality, we introduce a composite perceptual error function that combines content, color and texture losses. The first two losses are defined analytically, while the texture loss is learned in an adversarial fashion. We also present DPED, a large-scale dataset that consists of real photos captured from three different phones and one high-end reflex camera. Our quantitative and qualitative assessments reveal that the enhanced image quality is comparable to that of DSLR-taken photos, while the methodology is generalized to any type of digital camera.

Journal ArticleDOI
TL;DR: The results demonstrate that the proposed tensor-based method generally produces superior image quality and leads to more accurate material decomposition than the currently popular methods.
Abstract: Spectral computed tomography (CT) produces an energy-discriminative attenuation map of an object, extending a conventional image volume with a spectral dimension. In spectral CT, an image can be sparsely represented in each of multiple energy channels, and the channel images are highly correlated with one another. According to these characteristics, we propose a tensor-based dictionary learning method for spectral CT reconstruction. In our method, tensor patches are extracted from an image tensor, which is reconstructed using filtered backprojection (FBP), to form a training dataset. With the Candecomp/Parafac decomposition, a tensor-based dictionary is trained, in which each atom is a rank-one tensor. Then, the trained dictionary is used to sparsely represent image tensor patches during an iterative reconstruction process, and the alternating minimization scheme is adopted for optimization. The effectiveness of our proposed method is validated with both numerically simulated and real preclinical mouse datasets. The results demonstrate that the proposed tensor-based method generally produces superior image quality and leads to more accurate material decomposition than the currently popular methods.

Posted Content
TL;DR: In this article, a CoMatch layer was introduced to match the second order feature statistics with the target styles, which achieved real-time brush-size control in a purely feed-forward manner for style transfer.
Abstract: Despite the rapid progress in style transfer, existing approaches using feed-forward generative networks for multi-style or arbitrary-style transfer usually compromise image quality and model flexibility. We find it is fundamentally difficult to achieve comprehensive style modeling using a one-dimensional style embedding. Motivated by this, we introduce the CoMatch Layer, which learns to match the second-order feature statistics of the target styles. With the CoMatch Layer, we build a Multi-style Generative Network (MSG-Net), which achieves real-time performance. We also employ a specific strategy of upsampled convolution which avoids checkerboard artifacts caused by fractionally-strided convolution. Our method achieves superior image quality compared to state-of-the-art approaches. The proposed MSG-Net, as a general approach for real-time style transfer, is compatible with most existing techniques, including content-style interpolation, color-preserving, spatial control, and brush stroke size control. MSG-Net is the first to achieve real-time brush-size control in a purely feed-forward manner for style transfer. Our implementations and pre-trained models for the Torch, PyTorch, and MXNet frameworks will be publicly available.
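
Matching second-order feature statistics, the idea behind the CoMatch Layer, is conventionally done through Gram matrices of feature maps. A hedged PyTorch sketch of that comparison; the CoMatch Layer itself learns the matching inside the network, so this standalone loss is only an illustration:

```python
import torch

def gram(features):
    b, c, h, w = features.shape
    f = features.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)  # (B, C, C) second-order statistics

gen_feat = torch.randn(1, 64, 32, 32, requires_grad=True)   # generated-image features
style_feat = torch.randn(1, 64, 32, 32)                     # style-target features
style_loss = torch.mean((gram(gen_feat) - gram(style_feat)) ** 2)
style_loss.backward()
```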

Journal ArticleDOI
TL;DR: A new no-reference image quality assessment (NR IQA) model for HDR pictures is described, based on standard measurements of the bandpass and on newly conceived differential natural scene statistics (NSS) of HDR pictures, from which an algorithm called the HDR IMAGE GRADient-based Evaluator is derived.
Abstract: Being able to automatically predict digital picture quality, as perceived by human observers, has become important in many applications where humans are the ultimate consumers of displayed visual information. Standard dynamic range (SDR) images provide 8 b/color/pixel. High dynamic range (HDR) images, which are usually created from multiple exposures of the same scene, can provide 16 or 32 b/color/pixel, but must be tonemapped to SDR for display on standard monitors. Multi-exposure fusion techniques bypass HDR creation, by fusing the exposure stack directly to SDR format while aiming for aesthetically pleasing luminance and color distributions. Here, we describe a new no-reference image quality assessment (NR IQA) model for HDR pictures that is based on standard measurements of the bandpass and on newly conceived differential natural scene statistics (NSS) of HDR pictures. We derive an algorithm from the model which we call the HDR IMAGE GRADient-based Evaluator. NSS models have previously been used to devise NR IQA models that effectively predict the subjective quality of SDR images, but they perform significantly worse on tonemapped HDR content. Toward ameliorating this we make here the following contributions: 1) we design HDR picture NR IQA models and algorithms using both standard space-domain NSS features as well as novel HDR-specific gradient-based features that significantly elevate prediction performance; 2) we validate the proposed models on a large-scale crowdsourced HDR image database; and 3) we demonstrate that the proposed models also perform well on legacy natural SDR images. The software is available at: http://live.ece.utexas.edu/research/Quality/higradeRelease.zip .

Journal ArticleDOI
TL;DR: Thorough experiments conducted on standard databases show that the proposed novel full-reference IQA framework, codenamed DeepSim, can accurately predict human-perceived image quality and outperforms the previous state of the art.