Journal ArticleDOI

Are Recent SISR Techniques Suitable for Industrial Applications at Low Magnification?

01 Dec 2019-IEEE Transactions on Industrial Electronics (Institute of Electrical and Electronics Engineers (IEEE))-Vol. 66, Iss: 12, pp 9828-9836
TL;DR: A fast image upsampling method designed specifically for industrial applications at low magnification that can obtain performance comparable to that of some state-of-the-art methods for 720P-to-1080P magnification, but the computational cost is much lower.
Abstract: In recent years, many deep-network-based super-resolution techniques have been proposed and have achieved impressive results for 2× and higher magnification factors. However, lower magnification factors encountered in some industrial applications, such as 720P-to-1080P (1.5× magnification), have not received special attention. Compared to traditional 2× or higher magnification factors, these lower magnifications are much simpler, but reconstructions of high-definition images are time-consuming and computationally complex. Hence, in this paper, a fast image upsampling method is designed specifically for industrial applications at low magnification. In the proposed method, edge and nonedge areas are first distinguished and then reconstructed via different fast approaches. For the edge area, a local edge pattern encoding-based method is presented to recover sharp edges. For the nonedge area, a global iterative reconstruction with texture constraint is utilized. Moreover, some acceleration strategies are also presented to further reduce the complexity. The experimental results demonstrate that the proposed method can obtain performance comparable to that of some state-of-the-art methods for 720P-to-1080P magnification, but the computational cost is much lower.
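The paper's first stage, splitting a frame into edge and nonedge areas before reconstructing each differently, can be illustrated with a gradient-magnitude mask. This is only a minimal sketch under stated assumptions: the finite-difference gradient, the relative threshold, and the name `edge_mask` are illustrative choices, not the authors' actual local-edge-pattern encoding.

```python
import numpy as np

def edge_mask(img, thresh=0.2):
    """Split an image into edge / nonedge regions by gradient magnitude.

    Pixels whose gradient magnitude exceeds `thresh` times the maximum
    are marked as edge area; everything else is nonedge area.
    """
    gy, gx = np.gradient(img.astype(np.float64))  # central differences
    mag = np.hypot(gx, gy)
    return mag > thresh * mag.max()

# A synthetic step edge: left half dark, right half bright.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
mask = edge_mask(img)  # True only along the vertical step
```

In the paper's pipeline, the True pixels would then be handled by the edge-pattern method and the False pixels by the global iterative reconstruction.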
Citations
Journal ArticleDOI
TL;DR: This work designs a dual-channel network through 2D and 3D convolution to jointly exploit the information from both single band and adjacent bands, which is different from previous works.
Abstract: Deep learning-based hyperspectral image superresolution methods have achieved great success recently. However, most methods utilize 2D or 3D convolution to explore features, and rarely combine the two types of convolution to design networks. Moreover, when the model only contains 3D convolution, almost all the methods take all the bands of the hyperspectral image as input, which requires a larger memory footprint. To address these issues, we explore a new structure for hyperspectral image superresolution using spectrum and feature context. Inspired by the high similarity among adjacent bands, we design a dual-channel network through 2D and 3D convolution to jointly exploit the information from both the single band and adjacent bands, which is different from previous works. Under the connection of depth split, it can effectively share spatial information so as to improve the learning ability of the 2D spatial domain. Besides, our method introduces the features extracted from the previous band, which contributes to the complementarity of information and simplifies the network structure. Through feature context fusion, it significantly enhances the performance of the algorithm. Extensive evaluations and comparisons on three public datasets demonstrate that our approach produces state-of-the-art results compared with existing approaches.
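The dual-channel idea, a 2D-convolution path fed one band and a 3D-convolution path fed its neighbors, can be sketched by how the two branch inputs would be assembled from a hyperspectral cube. The window size `k` and the function name are assumptions for illustration, not the paper's actual architecture.

```python
import numpy as np

def dual_channel_inputs(cube, b, k=1):
    """Assemble the two branch inputs for band b of a hyperspectral
    cube shaped (bands, H, W): the band itself for the 2D-conv path,
    and a window of up to 2k+1 adjacent bands for the 3D-conv path.
    The window is clamped at the cube's edges."""
    bands = cube.shape[0]
    lo, hi = max(0, b - k), min(bands, b + k + 1)
    return cube[b], cube[lo:hi]

cube = np.arange(12, dtype=float).reshape(3, 2, 2)  # 3 toy bands of 2x2
single, neighbors = dual_channel_inputs(cube, 1)
```

Processing bands this way, instead of feeding the whole cube to 3D convolutions, is what keeps the memory footprint per step bounded by the window size rather than the band count.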

53 citations

Journal ArticleDOI
TL;DR: This article proposes a dense discriminative network that is composed of several aggregation modules (AM), which merges extraction and integration nodes in a tree structure, which can aggregate features progressively in an efficient way.
Abstract: Deep convolutional neural networks have recently made considerable achievements in the single-image superresolution (SISR) problem. Most CNN architectures for SISR incorporate long or short connections to integrate features, and treat them equally. However, they neglect the discrimination of features and consequently achieve relatively poor performance. To address this problem, in this article, we propose a dense discriminative network that is composed of several aggregation modules (AM). Specifically, the AM merges extraction and integration nodes in a tree structure, which can aggregate features progressively in an efficient way. In particular, we compress and rescale the densely connected information in the aggregation node by modeling the interaction between channels, which shares the same insight as the attention mechanism for improving the discriminative ability of the network. Extensive experiments conducted on several publicly available datasets have demonstrated the superiority of our model over the state of the art in objective metrics and visual impressions.
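The "compress and rescale" of densely connected channels described above is in the spirit of channel attention. A bare-bones sketch follows, with a fixed softmax over channel means standing in for the learned excitation layers; that substitution is my simplification, not the article's module.

```python
import numpy as np

def channel_rescale(feats):
    """Squeeze-and-excitation-style channel rescaling.

    feats: (C, H, W) feature maps. Each channel is squeezed to a scalar
    by global average pooling, the scalars are turned into weights, and
    each channel is rescaled by its weight. A real module would learn
    the weighting with small fully connected layers."""
    c = feats.shape[0]
    squeezed = feats.reshape(c, -1).mean(axis=1)  # global average pool
    e = np.exp(squeezed - squeezed.max())
    w = e / e.sum()                               # per-channel weights
    return feats * w[:, None, None]

feats = np.ones((2, 4, 4))
out = channel_rescale(feats)  # equal channels get equal weight 0.5
```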

50 citations


Cites background from "Are Recent SISR Techniques Suitable..."

  • ...At present, high-performance and low-cost superresolution (SR) techniques are still in high demand by many related industrial applications such as the video industry and display device industry [1]....


Journal ArticleDOI
Yuan Chen, Yang Zhao, Wei Jia, Li Cao, Xiaoping Liu
TL;DR: This survey aims to provide an overview of adversarial-learning-based methods by focusing on the image-to-image transformation scenario and introduces the network architectures of generative models and loss functions.

17 citations

Proceedings ArticleDOI
Yanjie Xu, Xin Li
01 Oct 2019
TL;DR: Experimental results show that this super-resolution image reconstruction method produces clearer face images than traditional methods, with higher Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index (SSIM) than images generated by deep residual networks.
Abstract: Super-resolution reconstruction technology is an important research topic in fields such as image processing and computer vision. It can be used widely in security monitoring, old-image reconstruction, image compression and transmission, and other areas. In this paper, super-resolution image reconstruction is performed on a low-resolution image at a 4× magnification factor. We propose using dense convolutional networks as the generator instead of residual networks, with perceptual loss as the optimization goal. We use a VGG-network feature map in the loss function instead of mean squared error, combining the perceptual loss with the adversarial loss; this helps compensate for the lack of high-frequency detail in previous methods. Experimental results show that our method can produce clearer face images than traditional methods. These reconstructed images have higher Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index (SSIM) than images generated by deep residual networks.

2 citations


Cites background from "Are Recent SISR Techniques Suitable..."

  • ...According to the quantity of low-resolution images, it can be divided into two categories: one is to reconstruct high-resolution images from multiple low-resolution images, and the other is to reconstruct high-resolution images from a single low-resolution image (SISR) [2]....


References
Book
01 Jan 1956
TL;DR: This is the revision of the classic text in the field, adding two new chapters and thoroughly updating all others; the original structure is retained, and the book continues to serve as a combined text/reference.
Abstract: This is the revision of the classic text in the field, adding two new chapters and thoroughly updating all others. The original structure is retained, and the book continues to serve as a combined text/reference.

35,552 citations

Proceedings ArticleDOI
21 Jul 2017
TL;DR: SRGAN proposes a perceptual loss function consisting of an adversarial loss and a content loss; the adversarial loss pushes the solution to the natural image manifold using a discriminator network trained to differentiate between super-resolved images and original photo-realistic images.
Abstract: Despite the breakthroughs in accuracy and speed of single image super-resolution using faster and deeper convolutional neural networks, one central problem remains largely unsolved: how do we recover the finer texture details when we super-resolve at large upscaling factors? The behavior of optimization-based super-resolution methods is principally driven by the choice of the objective function. Recent work has largely focused on minimizing the mean squared reconstruction error. The resulting estimates have high peak signal-to-noise ratios, but they are often lacking high-frequency details and are perceptually unsatisfying in the sense that they fail to match the fidelity expected at the higher resolution. In this paper, we present SRGAN, a generative adversarial network (GAN) for image super-resolution (SR). To our knowledge, it is the first framework capable of inferring photo-realistic natural images for 4x upscaling factors. To achieve this, we propose a perceptual loss function which consists of an adversarial loss and a content loss. The adversarial loss pushes our solution to the natural image manifold using a discriminator network that is trained to differentiate between the super-resolved images and original photo-realistic images. In addition, we use a content loss motivated by perceptual similarity instead of similarity in pixel space. Our deep residual network is able to recover photo-realistic textures from heavily downsampled images on public benchmarks. An extensive mean-opinion-score (MOS) test shows hugely significant gains in perceptual quality using SRGAN. The MOS scores obtained with SRGAN are closer to those of the original high-resolution images than to those obtained with any state-of-the-art method.
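The loss structure described here, a content term computed in feature space plus a small weighted adversarial term, can be written down directly. The plain-NumPy functions below are a schematic stand-in: the real content loss uses VGG feature maps and the adversarial term uses a trained discriminator, and the 1e-3 weight reflects the adversarial weighting reported for SRGAN-style training.

```python
import numpy as np

def content_loss(feat_sr, feat_hr):
    # MSE between feature representations of the SR and HR images
    # (SRGAN uses VGG feature maps; any fixed extractor stands in here).
    return np.mean((feat_sr - feat_hr) ** 2)

def adversarial_loss(d_sr):
    # -log D(G(LR)): rewards the generator for fooling the discriminator.
    return -np.mean(np.log(d_sr + 1e-12))

def perceptual_loss(feat_sr, feat_hr, d_sr, weight=1e-3):
    # SRGAN-style combination: content term plus a small adversarial term.
    return content_loss(feat_sr, feat_hr) + weight * adversarial_loss(d_sr)

feat = np.array([1.0, 2.0, 3.0])
d_out = np.array([1.0])                    # discriminator fully fooled
loss = perceptual_loss(feat, feat, d_out)  # ~0: identical features
```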

6,884 citations

Journal ArticleDOI
TL;DR: A deep learning method for single image super-resolution (SR) is proposed that directly learns an end-to-end mapping between the low/high-resolution images.
Abstract: We propose a deep learning method for single image super-resolution (SR). Our method directly learns an end-to-end mapping between the low/high-resolution images. The mapping is represented as a deep convolutional neural network (CNN) that takes the low-resolution image as the input and outputs the high-resolution one. We further show that traditional sparse-coding-based SR methods can also be viewed as a deep convolutional network. But unlike traditional methods that handle each component separately, our method jointly optimizes all layers. Our deep CNN has a lightweight structure, yet demonstrates state-of-the-art restoration quality, and achieves fast speed for practical on-line usage. We explore different network structures and parameter settings to achieve trade-offs between performance and speed. Moreover, we extend our network to cope with three color channels simultaneously, and show better overall reconstruction quality.
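The pipeline this abstract describes (SRCNN) has three conceptual stages: patch extraction, nonlinear mapping, and reconstruction, each realized as a convolution. A single-channel sketch is below; the real network uses many learned filters per stage, so the hand-picked kernels here only illustrate the pipeline's shape, not trained behavior.

```python
import numpy as np

def conv2d(x, k):
    """'Same' 2D convolution of one channel with zero padding (naive loops)."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x, dtype=np.float64)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def srcnn_like(lr_upscaled, k1, k2, k3):
    """SRCNN's three stages on an already-interpolated LR image:
    patch extraction -> nonlinear mapping -> reconstruction."""
    f1 = np.maximum(conv2d(lr_upscaled, k1), 0)  # ReLU after stage 1
    f2 = np.maximum(conv2d(f1, k2), 0)           # ReLU after stage 2
    return conv2d(f2, k3)                        # linear reconstruction

identity = np.array([[1.0]])                 # trivial 1x1 kernels
img = np.arange(9, dtype=float).reshape(3, 3)
out = srcnn_like(img, identity, identity, identity)
```

Note that SRCNN operates on the bicubic-upscaled input, which is exactly the design choice the sub-pixel convolution paper cited below argues against.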

6,122 citations

Journal ArticleDOI
TL;DR: This paper presents a new approach to single-image superresolution, based upon sparse signal representation, which generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods.
Abstract: This paper presents a new approach to single-image superresolution, based upon sparse signal representation. Research on image statistics suggests that image patches can be well-represented as a sparse linear combination of elements from an appropriately chosen over-complete dictionary. Inspired by this observation, we seek a sparse representation for each patch of the low-resolution input, and then use the coefficients of this representation to generate the high-resolution output. Theoretical results from compressed sensing suggest that under mild conditions, the sparse representation can be correctly recovered from the downsampled signals. By jointly training two dictionaries for the low- and high-resolution image patches, we can enforce the similarity of sparse representations between the low-resolution and high-resolution image patch pair with respect to their own dictionaries. Therefore, the sparse representation of a low-resolution image patch can be applied with the high-resolution image patch dictionary to generate a high-resolution image patch. The learned dictionary pair is a more compact representation of the patch pairs, compared to previous approaches, which simply sample a large number of image patch pairs, reducing the computational cost substantially. The effectiveness of such a sparsity prior is demonstrated for both general image super-resolution (SR) and the special case of face hallucination. In both cases, our algorithm generates high-resolution images that are competitive or even superior in quality to images produced by other similar SR methods. In addition, the local sparse modeling of our approach is naturally robust to noise, and therefore the proposed algorithm can handle SR with noisy inputs in a more unified framework.
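The coupled-dictionary mechanism can be made concrete in a few lines: solve a sparse code for the LR patch over the LR dictionary, then apply the same code to the HR dictionary. The tiny greedy orthogonal matching pursuit below stands in for the paper's l1 solver, and the identity dictionaries are a toy example, not learned dictionaries.

```python
import numpy as np

def omp(D, y, n_nonzero=2):
    """Tiny orthogonal matching pursuit: greedy sparse code of y over
    dictionary D (columns = atoms), refitting coefficients by least
    squares after each atom selection."""
    residual = y.astype(np.float64).copy()
    support = []
    for _ in range(n_nonzero):
        corr = np.abs(D.T @ residual)
        corr[support] = 0                      # don't reselect atoms
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha

# Coupled toy dictionaries: the same sparse code reconstructs HR.
D_lr = np.eye(4)                        # LR dictionary (4 atoms)
D_hr = 2.0 * np.eye(4)                  # coupled HR dictionary
y_lr = np.array([0.0, 3.0, 0.0, 1.0])   # LR patch (vectorized)
alpha = omp(D_lr, y_lr, n_nonzero=2)    # sparse code over D_lr
hr_patch = D_hr @ alpha                 # HR patch via shared code
```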

4,958 citations


"Are Recent SISR Techniques Suitable..." refers background or methods in this paper

  • ...In the patch-by-patch reconstruction process of traditional learning-based SR methods [9], [13], [16], [30], different...


  • ...However, does PSNR really work well for SISR? In many SISR approaches [9], [11], [31], researchers have observed that PSNR values are often inconsistent with the subjective quality....


  • ...els have been proposed, such as neighbor-embedding-based methods [7], [8], sparse-representation-based methods [9]– [11], and local self-similarity-based methods [12]....


Proceedings ArticleDOI
27 Jun 2016
TL;DR: This paper presents the first convolutional neural network capable of real-time SR of 1080p videos on a single K2 GPU and introduces an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output.
Abstract: Recently, several models based on deep neural networks have achieved great success in terms of both reconstruction accuracy and computational performance for single image super-resolution. In these methods, the low resolution (LR) input image is upscaled to the high resolution (HR) space using a single filter, commonly bicubic interpolation, before reconstruction. This means that the super-resolution (SR) operation is performed in HR space. We demonstrate that this is sub-optimal and adds computational complexity. In this paper, we present the first convolutional neural network (CNN) capable of real-time SR of 1080p videos on a single K2 GPU. To achieve this, we propose a novel CNN architecture where the feature maps are extracted in the LR space. In addition, we introduce an efficient sub-pixel convolution layer which learns an array of upscaling filters to upscale the final LR feature maps into the HR output. By doing so, we effectively replace the handcrafted bicubic filter in the SR pipeline with more complex upscaling filters specifically trained for each feature map, whilst also reducing the computational complexity of the overall SR operation. We evaluate the proposed approach using images and videos from publicly available datasets and show that it performs significantly better (+0.15dB on Images and +0.39dB on Videos) and is an order of magnitude faster than previous CNN-based methods.
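The sub-pixel convolution layer's final step is a periodic shuffling: the network computes r×r feature maps per output channel in LR space, and this operation interleaves them into the HR grid. A NumPy sketch of that rearrangement (the channel ordering follows the common pixel-shuffle convention; the learned upscaling filters that precede it are omitted):

```python
import numpy as np

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) tensor into (C, H*r, W*r).

    Each group of r*r input channels supplies the r x r sub-pixel
    positions of one output channel, so upscaling happens as a pure
    reshuffle with no interpolation."""
    c_rr, h, w = x.shape
    c = c_rr // (r * r)
    out = x.reshape(c, r, r, h, w)       # split channels into (r, r) grid
    out = out.transpose(0, 3, 1, 4, 2)   # -> (C, H, r, W, r)
    return out.reshape(c, h * r, w * r)  # interleave into the HR image

# Four 1x1 feature maps become one 2x2 HR image.
x = np.arange(4, dtype=float).reshape(4, 1, 1)
hr = pixel_shuffle(x, 2)
```

Because all convolutions run in LR space and the shuffle is free, this is the source of the order-of-magnitude speedup the abstract reports over networks that convolve in HR space.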

4,770 citations


"Are Recent SISR Techniques Suitable..." refers methods in this paper

  • ...the VGGNet-model-based VDSR [18], the residual network (ResNet)-based SR-ResNet [19], and the efficient pixel shuffling network [20]....
