Proceedings ArticleDOI

Deep learning based frameworks for image super-resolution and noise-resilient super-resolution

TL;DR: Experimental results show that the proposed noise-resilient super-resolution framework outperforms conventional and state-of-the-art approaches in terms of PSNR and SSIM metrics.
Abstract: Our paper is motivated by advances in deep learning algorithms for various computer vision problems. We propose a novel end-to-end deep learning based framework for image super-resolution. This framework simultaneously computes the convolutional features of low-resolution (LR) and high-resolution (HR) image patches and learns the non-linear function that maps the convolutional features of LR image patches to the convolutional features of their corresponding HR image patches. The proposed super-resolution architecture is termed the coupled deep convolutional auto-encoder (CDCA), and it provides state-of-the-art results. Super-resolution of noisy/distorted LR images results in noisy/distorted HR images, because the super-resolution process gives rise to spatial correlation in the noise, which then cannot be de-noised successfully. Traditional noise-resilient image super-resolution methods apply a de-noising algorithm prior to super-resolution, but the de-noising process loses some high-frequency information (edges and texture details), so super-resolution of the resulting image yields an HR image with missing edge and texture information. We also propose a novel end-to-end deep learning based framework for noise-resilient image super-resolution. This framework simultaneously performs image de-noising and super-resolution while preserving textural details. First, a stacked sparse de-noising auto-encoder (SSDA) was learned for LR image de-noising and the proposed CDCA was learned for image super-resolution. Then, the de-noising and super-resolution networks were cascaded. This cascaded network was employed as one integral network in which the pre-trained weights served as initial weights. The integral network was trained end-to-end (fine-tuned) on a database with a noisy LR image as input and an HR image as target. During fine-tuning, all layers of the combined end-to-end network were jointly optimized to perform image de-noising and super-resolution simultaneously. Experimental results show that the proposed noise-resilient super-resolution framework outperforms conventional and state-of-the-art approaches in terms of PSNR and SSIM metrics.
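The abstract reports results in terms of PSNR. As a reminder of what that metric measures, here is a minimal sketch using the standard definition, PSNR = 10 log10(peak^2 / MSE). The function name `psnr`, the list-of-rows image representation, and the sample pixel values are assumptions made for illustration; they are not taken from the paper's evaluation code.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale images.

    Images are lists of rows of pixel intensities; `peak` is the largest
    possible intensity (255 for 8-bit images).
    """
    if len(reference) != len(estimate) or any(
        len(r) != len(e) for r, e in zip(reference, estimate)
    ):
        raise ValueError("images must have the same shape")
    n = sum(len(row) for row in reference)
    # Mean squared error over all pixels.
    mse = sum(
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ) / n
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Made-up 2x2 "ground truth" and "super-resolved" images.
hr = [[10, 20], [30, 40]]
sr = [[12, 20], [30, 38]]
print(psnr(hr, sr))
```

A higher PSNR means the estimate is closer to the reference in the mean-squared sense; SSIM, the second metric mentioned, compares local structure instead and is not sketched here.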
Citations
Journal ArticleDOI
TL;DR: A new convolutional generator model is proposed to super-resolve low-resolution (LR) remote sensing data from an unsupervised perspective; it initially learns relationships between the LR and HR domains through several convolutional, downsampling, batch normalization, and activation layers.
Abstract: Super-resolution (SR) brings an excellent opportunity to improve a wide range of remote sensing applications. SR techniques are concerned with increasing the image resolution while providing finer spatial details than those captured by the original acquisition instrument. Therefore, SR techniques are particularly useful for coping with the increasing demand from remote sensing imaging applications that require fine spatial resolution. Even though different machine learning paradigms have been successfully applied in SR, more research is required to improve the SR process without the need for external high-resolution (HR) training examples. This paper proposes a new convolutional generator model to super-resolve low-resolution (LR) remote sensing data from an unsupervised perspective. That is, the proposed generative network is able to initially learn relationships between the LR and HR domains through several convolutional, downsampling, batch normalization, and activation layers. Then, the data are symmetrically projected to the target resolution while guaranteeing a reconstruction constraint over the LR input image. An experimental comparison is conducted using 12 different unsupervised SR methods over different test images. Our experiments reveal the potential of the proposed approach to improve the resolution of remote sensing imagery.
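One common way to express a reconstruction constraint over the LR input is to require that the super-resolved output, once downsampled again, matches the LR image. The sketch below illustrates that idea only. The abstract does not say which downsampling operator or error measure is used, so the block averaging, the mean squared error, and the names `downsample_mean` and `lr_consistency_error` are all assumptions.

```python
def downsample_mean(image, factor):
    """Downsample a grayscale image (list of rows) by averaging factor x factor blocks."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be divisible by the factor")
    return [
        [
            sum(
                image[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ) / factor ** 2
            for j in range(width // factor)
        ]
        for i in range(height // factor)
    ]


def lr_consistency_error(sr_image, lr_image, factor):
    """Mean squared difference between the downsampled SR output and the LR input."""
    down = downsample_mean(sr_image, factor)
    n = len(lr_image) * len(lr_image[0])
    return sum(
        (d - l) ** 2
        for down_row, lr_row in zip(down, lr_image)
        for d, l in zip(down_row, lr_row)
    ) / n


# Made-up 2x2 SR output checked against a 1x1 LR input at scale factor 2.
sr_output = [[1, 3], [5, 7]]
print(lr_consistency_error(sr_output, [[4]], factor=2))
```

An error of zero means the SR output is fully consistent with the LR observation under the assumed downsampling model; an unsupervised method can minimise such a term without any external HR examples.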

115 citations

Book ChapterDOI
08 Sep 2018
TL;DR: A novel automated pipeline leveraging deep convolutional neural networks for stomata detection and quantification shows superior performance to existing stomata detection methods in terms of precision and recall.
Abstract: Analysis of stomata density and configuration based on scanning electron microscopic (SEM) images of a leaf surface is an effective way to characterize the plant’s behaviour under various environmental stresses (drought, salinity etc.). Existing methods for phenotyping these stomatal traits are often based on manual or semi-automatic labeling and segmentation of SEM images. This is a low-throughput process when a large number of SEM images is investigated for statistical analysis. To overcome this limitation, we propose a novel automated pipeline leveraging deep convolutional neural networks for stomata detection and quantification. The proposed framework shows superior performance to existing stomata detection methods in terms of precision and recall, 0.91 and 0.89 respectively. Furthermore, the morphological traits (i.e. length & width) obtained at the stomata quantification step show correlations of 0.95 and 0.91 with manually computed traits, resulting in an efficient and high-throughput solution for stomata phenotyping.
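Precision and recall for a detector are computed from counts of correct detections, spurious detections, and missed objects. The following is a minimal sketch of the standard definitions; the function name and the example counts are made up for illustration and are not the counts behind the 0.91 and 0.89 figures quoted above.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) for a detector from its raw counts.

    precision = TP / (TP + FP): fraction of detections that are correct.
    recall    = TP / (TP + FN): fraction of real objects that were found.
    """
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts: 8 correct detections, 2 spurious ones, 2 missed objects.
print(precision_recall(8, 2, 2))
```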

15 citations

Patent
Zhaowen Wang
13 Oct 2017
TL;DR: The neural network determines a network parameter based on the noise reduction level and removes one or more of the noise artifacts from the low resolution image during the conversion by using the network parameter.
Abstract: Systems and techniques for converting a low resolution image to a high resolution image include receiving a low resolution image having one or more noise artifacts at a neural network. A noise reduction level is received at the neural network. The neural network determines a network parameter based on the noise reduction level. The neural network converts the low resolution image to a high resolution image and removes one or more of the noise artifacts from the low resolution image during the conversion by using the network parameter. The neural network outputs the high resolution image.

6 citations

Proceedings ArticleDOI
18 Jun 2018
TL;DR: A novel Improved Residual based Gradual Up-Scaling Network (IRGUN) is proposed to improve the quality of the super-resolved image for a large magnification factor; it recovers fine details effectively at large (8X) magnification factors.
Abstract: Convolutional neural network based architectures have achieved decent perceptual quality super resolution on natural images for small scaling factors (2X and 4X). However, image super-resolution for large magnification factors (8X) is an extremely challenging problem for the computer vision community. In this paper, we propose a novel Improved Residual based Gradual Up-Scaling Network (IRGUN) to improve the quality of the super-resolved image for a large magnification factor. IRGUN has a Gradual Upsampling and Residue-based Enhancement Network (GUREN), which comprises a series of Up-scaling and Enhancement blocks (UEB) connected end-to-end and fine-tuned together to give a gradual magnification and enhancement. Due to the perceptual importance of the luminance in super-resolution, the model is trained on the luminance (Y) channel of the YCbCr image. The chrominance channels (Cb and Cr), in contrast, are up-scaled using bicubic interpolation and combined with the super-resolved Y channel of the image, which is then converted to RGB. A cascaded 3D-RED architecture trained on RGB images is utilized to incorporate inter-channel correlation. In addition to this, the training methodology is also presented in the paper. In the training procedure, the weights of the previous UEB are used in the next immediate UEB for faster and better convergence. Each UEB is trained on its respective scale by taking the output image of the previous UEB as input and the corresponding HR image of the same scale as ground truth to the successive UEB. All the UEBs are then connected end-to-end and fine-tuned. The IRGUN recovers fine details effectively at large (8X) magnification factors. The efficiency of IRGUN is presented on various benchmark datasets and at different magnification scales.
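The luminance/chrominance split described above starts from an RGB to YCbCr conversion. The sketch below shows that conversion and its inverse for 8-bit pixels. It assumes the full-range BT.601 (JPEG-style) coefficients, which the abstract does not specify, and the function names and list-of-tuples image layout are illustrative. The super-resolution network and the bicubic up-scaling themselves are omitted.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range BT.601 (JPEG-style) YCbCr."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Approximate inverse of rgb_to_ycbcr (no clipping or rounding applied)."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return r, g, b


def split_channels(image):
    """Split an RGB image (rows of (r, g, b) tuples) into separate Y, Cb, Cr planes."""
    y_plane, cb_plane, cr_plane = [], [], []
    for row in image:
        converted = [rgb_to_ycbcr(*pixel) for pixel in row]
        y_plane.append([c[0] for c in converted])
        cb_plane.append([c[1] for c in converted])
        cr_plane.append([c[2] for c in converted])
    return y_plane, cb_plane, cr_plane


# A made-up 1x2 image: one orange-ish pixel and one mid-grey pixel.
y, cb, cr = split_channels([[(200, 100, 50), (128, 128, 128)]])
print(y, cb, cr)
```

In a pipeline of the kind the abstract describes, the Y plane would go to the learned up-scaler, the Cb and Cr planes would be interpolated to the same size, and `ycbcr_to_rgb` would be applied per pixel to recombine them.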

6 citations

Book ChapterDOI
16 Dec 2017
TL;DR: The proposed framework for image inpainting provides more visually plausible and better resultant images than other conventional and state-of-the-art noise-resilient super-resolution algorithms.
Abstract: Image inpainting is an extremely challenging and open problem for the computer vision community. Motivated by the recent advancement in deep learning algorithms for computer vision applications, we propose a new end-to-end deep learning based framework for image inpainting. First, the images are down-sampled, which reduces the target area for inpainting and thereby enables better filling of the target region. A down-sampled image is inpainted using a trained deep convolutional auto-encoder (CAE). A coupled deep convolutional auto-encoder (CDCA) is also trained for natural image super resolution. The pre-trained weights from both of these networks serve as initial weights to an end-to-end framework during the fine tuning phase. Hence, the network is jointly optimized for both the aforementioned tasks while maintaining the local structure/information. We tested this proposed framework with various existing image inpainting datasets and it outperforms existing natural image blind inpainting algorithms. Our proposed framework also works well to get noise resilient super-resolution after fine-tuning on noise-free super-resolution dataset. It provides more visually plausible and better resultant images than other conventional and state-of-the-art noise-resilient super-resolution algorithms.

2 citations

References
Journal ArticleDOI
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

30,811 citations

Journal ArticleDOI
TL;DR: A novel algorithm for adapting dictionaries in order to achieve sparse signal representations, the K-SVD algorithm, an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data.
Abstract: In recent years there has been a growing interest in the study of sparse representation of signals. Using an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse linear combinations of these atoms. Applications that use sparse representation are many and include compression, regularization in inverse problems, feature extraction, and more. Recent activity in this field has concentrated mainly on the study of pursuit algorithms that decompose signals with respect to a given dictionary. Designing dictionaries to better fit the above model can be done by either selecting one from a prespecified set of linear transforms or adapting the dictionary to a set of training signals. Both of these techniques have been considered, but this topic is largely still open. In this paper we propose a novel algorithm for adapting dictionaries in order to achieve sparse signal representations. Given a set of training signals, we seek the dictionary that leads to the best representation for each member in this set, under strict sparsity constraints. We present a new method, the K-SVD algorithm, which generalizes the K-means clustering process. K-SVD is an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data. The update of the dictionary columns is combined with an update of the sparse representations, thereby accelerating convergence. The K-SVD algorithm is flexible and can work with any pursuit method (e.g., basis pursuit, FOCUSS, or matching pursuit). We analyze this algorithm and demonstrate its results both on synthetic tests and in applications on real image data.

8,905 citations

Journal ArticleDOI
TL;DR: An algorithm based on an enhanced sparse representation in transform domain based on a specially developed collaborative Wiener filtering achieves state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and subjective visual quality.
Abstract: We propose a novel image denoising strategy based on an enhanced sparse representation in transform domain. The enhancement of the sparsity is achieved by grouping similar 2D image fragments (e.g., blocks) into 3D data arrays which we call "groups." Collaborative filtering is a special procedure developed to deal with these 3D groups. We realize it using three successive steps: 3D transformation of a group, shrinkage of the transform spectrum, and inverse 3D transformation. The result is a 3D estimate that consists of the jointly filtered grouped image blocks. By attenuating the noise, the collaborative filtering reveals even the finest details shared by grouped blocks and, at the same time, it preserves the essential unique features of each individual block. The filtered blocks are then returned to their original positions. Because these blocks are overlapping, for each pixel, we obtain many different estimates which need to be combined. Aggregation is a particular averaging procedure which is exploited to take advantage of this redundancy. A significant improvement is obtained by a specially developed collaborative Wiener filtering. An algorithm based on this novel denoising strategy and its efficient implementation are presented in full detail; an extension to color-image denoising is also developed. The experimental results demonstrate that this computationally scalable algorithm achieves state-of-the-art denoising performance in terms of both peak signal-to-noise ratio and subjective visual quality.
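The "transform, shrink, inverse transform" pattern at the heart of collaborative filtering can be illustrated on a much simpler case. The sketch below applies a single-level 1D Haar transform to a short signal, hard-thresholds the detail coefficients, and inverts the transform. This is not BM3D: there is no block matching, no 3D grouping, no aggregation, and no Wiener stage. The choice of the Haar transform, the threshold value, and the function names are assumptions made purely for illustration.

```python
import math

_SCALE = 1.0 / math.sqrt(2.0)


def haar_forward(signal):
    """Single-level orthonormal Haar transform of an even-length signal."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * _SCALE for a, b in pairs]
    detail = [(a - b) * _SCALE for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * _SCALE, (a - d) * _SCALE])
    return out


def hard_threshold(coeffs, threshold):
    """Shrinkage step: zero every coefficient whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


def shrink_denoise(signal, threshold):
    """Transform, shrink the detail coefficients, and transform back."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, hard_threshold(detail, threshold))


# Made-up signal: small fluctuations around 10, followed by a genuine edge.
noisy = [10.2, 9.8, 10.1, 9.9, 20.0, 0.0]
print(shrink_denoise(noisy, threshold=1.0))
```

Small detail coefficients, which mostly carry noise, are removed, while the large coefficient produced by the edge between 20.0 and 0.0 survives the threshold, so the edge is kept. This is the sense in which shrinkage in a sparsifying transform domain attenuates noise while preserving structure.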

7,912 citations


"Deep learning based frameworks for ..." refers methods in this paper

  • ...EPLL [7] and BM3D [8] were other popular algorithms for image de-noising....

    [...]

  • ...We represent BM3D+ScSR as conventional1 and SSDA+CNN as a conventional2 algorithm....

    [...]

Proceedings ArticleDOI
07 Jul 2001
TL;DR: In this paper, the authors present a database containing ground truth segmentations produced by humans for images of a wide variety of natural scenes, and define an error measure which quantifies the consistency between segmentations of differing granularities.
Abstract: This paper presents a database containing 'ground truth' segmentations produced by humans for images of a wide variety of natural scenes. We define an error measure which quantifies the consistency between segmentations of differing granularities and find that different human segmentations of the same image are highly consistent. Use of this dataset is demonstrated in two applications: (1) evaluating the performance of segmentation algorithms and (2) measuring probability distributions associated with Gestalt grouping factors as well as statistics of image region properties.

6,505 citations

Journal ArticleDOI
TL;DR: This work addresses the image denoising problem, where zero-mean white and homogeneous Gaussian additive noise is to be removed from a given image, and uses the K-SVD algorithm to obtain a dictionary that describes the image content effectively.
Abstract: We address the image denoising problem, where zero-mean white and homogeneous Gaussian additive noise is to be removed from a given image. The approach taken is based on sparse and redundant representations over trained dictionaries. Using the K-SVD algorithm, we obtain a dictionary that describes the image content effectively. Two training options are considered: using the corrupted image itself, or training on a corpus of high-quality image database. Since the K-SVD is limited in handling small image patches, we extend its deployment to arbitrary image sizes by defining a global image prior that forces sparsity over patches in every location in the image. We show how such Bayesian treatment leads to a simple and effective denoising algorithm. This leads to a state-of-the-art denoising performance, equivalent to and sometimes surpassing recently published leading alternative denoising methods.

5,493 citations