scispace - formally typeset

Journal ArticleDOI

Image quality assessment: from error visibility to structural similarity

01 Apr 2004-IEEE Transactions on Image Processing (IEEE)-Vol. 13, Iss: 4, pp 600-612

Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu//spl sim/lcv/ssim/.
Topics: Image quality (61%), Subjective video quality (56%), Human visual system model (56%), Image processing (54%), Visibility (geometry) (53%)
Citations
More filters

Proceedings ArticleDOI
21 Jul 2017-
TL;DR: Conditional adversarial networks are investigated as a general-purpose solution to image-to-image translation problems and it is demonstrated that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks.
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Moreover, since the release of the pix2pix software associated with this paper, hundreds of twitter users have posted their own artistic experiments using our system. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without handengineering our loss functions either.

9,134 citations


Posted Content
Abstract: We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. Indeed, since the release of the pix2pix software associated with this paper, a large number of internet users (many of them artists) have posted their own experiments with our system, further demonstrating its wide applicability and ease of adoption without the need for parameter tweaking. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either.

7,368 citations


Book ChapterDOI
Justin Johnson1, Alexandre Alahi1, Li Fei-Fei1Institutions (1)
08 Oct 2016-
Abstract: We consider image transformation problems, where an input image is transformed into an output image. Recent methods for such problems typically train feed-forward convolutional neural networks using a per-pixel loss between the output and ground-truth images. Parallel work has shown that high-quality images can be generated by defining and optimizing perceptual loss functions based on high-level features extracted from pretrained networks. We combine the benefits of both approaches, and propose the use of perceptual loss functions for training feed-forward networks for image transformation tasks. We show results on image style transfer, where a feed-forward network is trained to solve the optimization problem proposed by Gatys et al. in real-time. Compared to the optimization-based method, our network gives similar qualitative results but is three orders of magnitude faster. We also experiment with single-image super-resolution, where replacing a per-pixel loss with a perceptual loss gives visually pleasing results.

5,568 citations


Proceedings ArticleDOI
Christian Ledig1, Lucas Theis1, Ferenc Huszar2, Jose Caballero3  +7 moreInstitutions (3)
21 Jul 2017-
Abstract: Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors? The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function. Recent work has largely focused on minimizing the mean squared reconstruction error. The resulting estimates have high peak signal-to-noise ratios, but they are often lacking high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution. In this paper, we present SRGAN, a generative adversarial network (GAN) for image super-resolution (SR). To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors. To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss. The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images. In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space. Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks. An extensive mean-opinion-score (MOS) test shows hugely significant gains in perceptual quality using SRGAN. The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.

5,157 citations


Journal ArticleDOI
Abstract: We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. But unlike traditional methods that handle each component separately, our method jointly optimizes all layers. Our deep CNN has a lightweight structure, yet demonstrates state-of-the-art restoration quality, and achieves fast speed for practical on-line usage. We explore different network structures and parameter settings to achieve trade-offs between performance and speed. Moreover, we extend our network to cope with three color channels simultaneously, and show better overall reconstruction quality.

4,680 citations


References
More filters

Journal ArticleDOI
Amir Said1, William A. Pearlman2Institutions (2)
TL;DR: The image coding results, calculated from actual file sizes and images reconstructed by the decoding algorithm, are either comparable to or surpass previous results obtained through much more sophisticated and computationally complex methods.
Abstract: Embedded zerotree wavelet (EZW) coding, introduced by Shapiro (see IEEE Trans. Signal Processing, vol.41, no.12, p.3445, 1993), is a very effective and computationally simple technique for image compression. We offer an alternative explanation of the principles of its operation, so that the reasons for its excellent performance can be better understood. These principles are partial ordering by magnitude with a set partitioning sorting algorithm, ordered bit plane transmission, and exploitation of self-similarity across different scales of an image wavelet transform. Moreover, we present a new and different implementation based on set partitioning in hierarchical trees (SPIHT), which provides even better performance than our previously reported extension of EZW that surpassed the performance of the original EZW. The image coding results, calculated from actual file sizes and images reconstructed by the decoding algorithm, are either comparable to or surpass previous results obtained through much more sophisticated and computationally complex methods. In addition, the new coding and decoding procedures are extremely fast, and they can be made even faster, with only small loss in performance, by omitting entropy coding of the bit stream by the arithmetic code.

5,812 citations


Journal ArticleDOI
J.M. Shapiro1Institutions (1)
TL;DR: The embedded zerotree wavelet algorithm (EZW) is a simple, yet remarkably effective, image compression algorithm, having the property that the bits in the bit stream are generated in order of importance, yielding a fully embedded code.
Abstract: The embedded zerotree wavelet algorithm (EZW) is a simple, yet remarkably effective, image compression algorithm, having the property that the bits in the bit stream are generated in order of importance, yielding a fully embedded code The embedded code represents a sequence of binary decisions that distinguish an image from the "null" image Using an embedded coding algorithm, an encoder can terminate the encoding at any point thereby allowing a target rate or target distortion metric to be met exactly Also, given a bit stream, the decoder can cease decoding at any point in the bit stream and still produce exactly the same image that would have been encoded at the bit rate corresponding to the truncated bit stream In addition to producing a fully embedded bit stream, the EZW consistently produces compression results that are competitive with virtually all known compression algorithms on standard test images Yet this performance is achieved with a technique that requires absolutely no training, no pre-stored tables or codebooks, and requires no prior knowledge of the image source The EZW algorithm is based on four key concepts: (1) a discrete wavelet transform or hierarchical subband decomposition, (2) prediction of the absence of significant information across scales by exploiting the self-similarity inherent in images, (3) entropy-coded successive-approximation quantization, and (4) universal lossless data compression which is achieved via adaptive arithmetic coding >

5,503 citations


Journal ArticleDOI
Zhou Wang, Alan C. Bovik1Institutions (1)
TL;DR: Although the new index is mathematically defined and no human visual system model is explicitly employed, experiments on various image distortion types indicate that it performs significantly better than the widely used distortion metric mean squared error.
Abstract: We propose a new universal objective image quality index, which is easy to calculate and applicable to various image processing applications. Instead of using traditional error summation methods, the proposed index is designed by modeling any image distortion as a combination of three factors: loss of correlation, luminance distortion, and contrast distortion. Although the new index is mathematically defined and no human visual system model is explicitly employed, our experiments on various image distortion types indicate that it performs significantly better than the widely used distortion metric mean squared error. Demonstrative images and an efficient MATLAB implementation of the algorithm are available online at http://anchovy.ece.utexas.edu//spl sim/zwang/research/quality_index/demo.html.

4,687 citations


Book
David Taubman1, Michael W. MarcellinInstitutions (1)
30 Nov 2001-
TL;DR: This work has specific applications for those involved in the development of software and hardware solutions for multimedia, internet, and medical imaging applications.
Abstract: This is nothing less than a totally essential reference for engineers and researchers in any field of work that involves the use of compressed imagery. Beginning with a thorough and up-to-date overview of the fundamentals of image compression, the authors move on to provide a complete description of the JPEG2000 standard. They then devote space to the implementation and exploitation of that standard. The final section describes other key image compression systems. This work has specific applications for those involved in the development of software and hardware solutions for multimedia, internet, and medical imaging applications.

2,938 citations


Journal ArticleDOI
TL;DR: Although some numerical measures correlate well with the observers' response for a given compression technique, they are not reliable for an evaluation across different techniques, and a graphical measure called Hosaka plots can be used to appropriately specify not only the amount, but also the type of degradation in reconstructed images.
Abstract: A number of quality measures are evaluated for gray scale image compression. They are all bivariate, exploiting the differences between corresponding pixels in the original and degraded images. It is shown that although some numerical measures correlate well with the observers' response for a given compression technique, they are not reliable for an evaluation across different techniques. A graphical measure called Hosaka plots, however, can be used to appropriately specify not only the amount, but also the type of degradation in reconstructed images.

1,445 citations


Network Information
Related Papers (5)
27 Jun 2016

Kaiming He, Xiangyu Zhang +2 more

01 Jan 2015

Diederik P. Kingma, Jimmy Ba

05 Oct 2015

Olaf Ronneberger, Philipp Fischer +1 more

08 Dec 2014

Ian Goodfellow, Jean Pouget-Abadie +6 more

21 Jul 2017

Phillip Isola, Jun-Yan Zhu +2 more

Performance
Metrics
No. of citations received by the Paper in previous years
YearCitations
202288
20214,656
20204,707
20193,815
20183,091
20172,407