Author

Avinash Upadhyay

Bio: Avinash Upadhyay is an academic researcher from the Council of Scientific and Industrial Research. The author has contributed to research in the topics of RGB color model and image resolution, has an h-index of 5, and has co-authored 7 publications receiving 64 citations. Previous affiliations of Avinash Upadhyay include the Central Electronics Engineering Research Institute.

Papers
Proceedings ArticleDOI
01 Jun 2018
TL;DR: This work proposes 2D and 3D convolutional neural network based approaches for hyperspectral image reconstruction from RGB images; the 3D-CNN architecture achieves very good performance in terms of MRAE and RMSE.
Abstract: Hyperspectral cameras preserve fine spectral details of scenes that are not captured by traditional RGB cameras, which quantize scene radiance into just three channels. These spectral details provide additional information that improves the performance of numerous image-based analytic applications, but due to the high cost of hyperspectral hardware and associated physical constraints, hyperspectral images are not easily available for further processing. Motivated by the performance of deep learning in various computer vision applications, we propose 2D convolutional neural network (2D-CNN) and 3D convolutional neural network (3D-CNN) based approaches for hyperspectral image reconstruction from RGB images. The 2D-CNN model extracts spectral data by considering only the spatial correlation of the channels in the image, while the 3D-CNN model also exploits inter-channel correlation to refine the extraction of spectral data. Our 3D-CNN based architecture achieves very good performance in terms of MRAE and RMSE, and our 2D-CNN based architecture achieves comparable performance with much lower computational complexity.

50 citations
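The abstract above reports reconstruction quality in terms of MRAE and RMSE. As a minimal pure-Python sketch of those two metrics (the function names and the epsilon guard against division by zero are illustrative, not from the paper):

```python
import math

def mrae(pred, target, eps=1e-8):
    """Mean Relative Absolute Error: mean of |p - t| / t over all values."""
    return sum(abs(p - t) / (t + eps) for p, t in zip(pred, target)) / len(target)

def rmse(pred, target):
    """Root Mean Squared Error."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target))

# Toy spectra flattened to 1-D lists; real inputs are H x W x C hyperspectral cubes.
truth = [0.2, 0.4, 0.6, 0.8]
recon = [0.25, 0.38, 0.61, 0.75]
print(f"MRAE={mrae(recon, truth):.4f}  RMSE={rmse(recon, truth):.4f}")
```

MRAE normalizes each error by the ground-truth value, so it weights dark spectral bands more heavily than RMSE does.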

Proceedings ArticleDOI
16 Jun 2019
TL;DR: This paper reviews the NTIRE challenge on image colorization (estimating color information from the corresponding gray image) with a focus on the proposed solutions and results.
Abstract: This paper reviews the NTIRE challenge on image colorization (estimating color information from the corresponding gray image) with a focus on the proposed solutions and results. It is the first challenge of its kind. The challenge had 2 tracks. Track 1 takes a single gray image as input. In Track 2, in addition to the gray input image, some color seeds (randomly sampled from the latent color image) are also provided to guide the colorization process. The colorization operators were learnable through provided pairs of gray and color training images. The tracks had 188 registered participants, and 8 teams competed in the final testing phase.

22 citations

Proceedings ArticleDOI
01 Sep 2019
TL;DR: An end-to-end trainable deep-learning based framework for the joint optimization of document enhancement and recognition, using a generative adversarial network (GAN) based framework to perform image denoising, followed by a deep back-projection network (DBPN) for super-resolution.
Abstract: Recognizing text from degraded and low-resolution document images is still an open challenge in the vision community. Existing text recognition systems require a certain resolution and fail if the document is low-resolution, heavily degraded, or noisy. This paper presents an end-to-end trainable deep-learning based framework for the joint optimization of document enhancement and recognition. We use a generative adversarial network (GAN) based framework to perform image denoising, followed by a deep back-projection network (DBPN) for super-resolution, and use the super-resolved features to train a bidirectional long short-term memory (BLSTM) network with Connectionist Temporal Classification (CTC) for the recognition of textual sequences. The entire network is end-to-end trainable, and we obtain results that improve on the state-of-the-art for both the image enhancement and document recognition tasks. We demonstrate results on both printed and handwritten degraded document datasets to show the generalization capability of our proposed robust framework.

12 citations
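The recognition stage above pairs a BLSTM with CTC. A common way to read out a CTC-trained network (not necessarily the paper's exact decoder) is greedy decoding: take the per-frame argmax labels, merge repeats, then drop blanks. A minimal sketch, with the blank index and label values chosen for illustration:

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame argmax sequence the CTC way:
    merge repeated labels, then drop the blank symbol."""
    out = []
    prev = None
    for lab in frame_labels:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out

# Frames: blank=0; label 8 held for two frames, a blank, then label 9.
print(ctc_greedy_decode([0, 8, 8, 0, 9, 9, 0]))  # [8, 9]
```

The blank symbol is what lets CTC emit genuinely repeated characters: `[1, 1, 0, 1]` decodes to `[1, 1]`, not `[1]`.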

Proceedings ArticleDOI
16 Jun 2019
TL;DR: A Robust Image Colorization method using a Self-attention based Progressive Generative Adversarial Network (RIC-SPGAN), which consists of a residual encoder-decoder (RED) network and a self-attention based progressive generative adversarial network (SP-GAN) in cascaded form to perform denoising and colorization of the image.
Abstract: Automatic image colorization is a very interesting computer graphics problem wherein an input grayscale image is transformed into the RGB domain. However, it is an ill-posed problem, as there can be multiple RGB outcomes for a particular grayscale pixel. The problem is further complicated if noise is present in the grayscale image. In this paper, we propose Robust Image Colorization using a Self-attention based Progressive Generative Adversarial Network (RIC-SPGAN), which consists of a residual encoder-decoder (RED) network and a self-attention based progressive generative adversarial network (SP-GAN) in cascaded form to perform denoising and colorization of the image. We use the self-attention based progressive network to model long-range dependencies and gradually enhance the resolution of the colorized image for faster, more stable, and variation-rich feature generation. We also present a stabilization technique for the generative model. Our model shows exceptional perceptual results on both noisy and clean grayscale images. We trained the model on ILSVRC2012; visual results on DIV2K images with and without noise are presented in the paper, along with the model's failure cases.

7 citations
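The ill-posedness mentioned above can be made concrete: many distinct RGB triples map to the same gray level under a standard luma formula such as ITU-R BT.601, so the inverse mapping has no unique answer. A small illustration (the example colors are arbitrary):

```python
def luma(r, g, b):
    """ITU-R BT.601 luma: the gray level a standard pipeline records for a color."""
    return 0.299 * r + 0.587 * g + 0.114 * b

# Two very different colors land on the same 8-bit gray value,
# so a colorizer cannot recover the original hue from gray alone.
red   = round(luma(255, 0, 0))   # pure red  -> 76
green = round(luma(0, 130, 0))   # a green   -> 76
print(red, green)
```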

Proceedings ArticleDOI
18 Jun 2018
TL;DR: A novel Improved Residual based Gradual Up-Scaling Network (IRGUN) to improve the quality of the super-resolved image for a large magnification factor and recovers fine details effectively at large (8X) magnification factors.
Abstract: Convolutional neural network based architectures have achieved decent perceptual-quality super-resolution on natural images for small scaling factors (2X and 4X). However, image super-resolution for large magnification factors (8X) is an extremely challenging problem for the computer vision community. In this paper, we propose a novel Improved Residual based Gradual Up-Scaling Network (IRGUN) to improve the quality of the super-resolved image for a large magnification factor. IRGUN has a Gradual Upsampling and Residue-based Enhancement Network (GUREN), which comprises a series of Up-scaling and Enhancement Blocks (UEB) connected end-to-end and fine-tuned together to give gradual magnification and enhancement. Because of the perceptual importance of luminance in super-resolution, the model is trained on the luminance (Y) channel of the YCbCr image, while the chrominance channels (Cb and Cr) are up-scaled using bicubic interpolation and combined with the super-resolved Y channel, after which the result is converted to RGB. A cascaded 3D-RED architecture trained on RGB images is utilized to incorporate inter-channel correlation. The training methodology is also presented in the paper: the weights of each UEB are used to initialize the next UEB for faster and better convergence, and each UEB is trained at its respective scale by taking the output image of the previous UEB as input and the HR image of the same scale as ground truth. All the UEBs are then connected end-to-end and fine-tuned. IRGUN recovers fine details effectively at large (8X) magnification factors, and its efficiency is demonstrated on various benchmark datasets and at different magnification scales.

6 citations
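The abstract above super-resolves the Y channel and bicubically upsamples Cb/Cr separately, which relies on the RGB to YCbCr conversion and its inverse. A minimal full-range BT.601 version of that conversion (a sketch; the paper does not state its exact constants):

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range ITU-R BT.601 RGB -> YCbCr (inputs in 0..255)."""
    y  =        0.299    * r + 0.587    * g + 0.114    * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128 + 0.5      * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    """Inverse of the above; exact up to floating-point rounding."""
    r = y + 1.402    * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772    * (cb - 128)
    return r, g, b

# Round trip: convert, then convert back.
print(ycbcr_to_rgb(*rgb_to_ycbcr(100, 150, 200)))
```

Splitting the channels this way lets the expensive super-resolution network run on the single perceptually dominant Y plane, while the smoother chroma planes get by with cheap interpolation.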


Cited by
Proceedings ArticleDOI
18 Jun 2018
TL;DR: This paper reviews the 2nd NTIRE challenge on single image super-resolution (restoration of rich details in a low resolution image) with focus on proposed solutions and results and gauges the state-of-the-art in single imagesuper-resolution.
Abstract: This paper reviews the 2nd NTIRE challenge on single image super-resolution (restoration of rich details in a low-resolution image) with a focus on the proposed solutions and results. The challenge had 4 tracks. Track 1 employed the standard bicubic downscaling setup, while Tracks 2, 3, and 4 had realistic unknown downgrading operators simulating a camera image acquisition pipeline. The operators were learnable through provided pairs of low- and high-resolution training images. The tracks had 145, 114, 101, and 113 registered participants, respectively, and 31 teams competed in the final testing phase. The results gauge the state-of-the-art in single image super-resolution.

298 citations

Proceedings ArticleDOI
01 Oct 2019
TL;DR: The λ-net, which reconstructs hyperspectral images from a single shot measurement, can finish the reconstruction task within sub-seconds instead of hours taken by the most recently proposed DeSCI algorithm, thus speeding up the reconstruction >1000 times.
Abstract: We propose λ-net, which reconstructs hyperspectral images (e.g., with 24 spectral channels) from a single-shot measurement. This task is usually termed snapshot compressive spectral imaging (SCI), which enjoys low cost, low bandwidth, and a high-speed sensing rate by capturing the three-dimensional (3D) signal, i.e., (x, y, λ), in a 2D snapshot. Though proposed more than a decade ago, the poor quality and low speed of reconstruction algorithms have precluded wide application of SCI. To address this challenge, we develop a dual-stage generative model, dubbed λ-net, to reconstruct the desired 3D signal in SCI. Results on both simulated and real datasets demonstrate the significant advantages of λ-net, which yields a >4dB improvement in PSNR on real-mask-in-the-loop simulation data compared to the current state-of-the-art. Furthermore, λ-net finishes the reconstruction task within sub-seconds instead of the hours taken by the recently proposed DeSCI algorithm, speeding up reconstruction by more than 1000 times.

149 citations
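The >4dB PSNR gain quoted above can be read off the PSNR definition: each extra 4 dB means the mean squared error drops by a factor of 10^(4/10) ≈ 2.5. A minimal sketch (the signal range and function name are illustrative):

```python
import math

def psnr(pred, target, peak=1.0):
    """Peak Signal-to-Noise Ratio in dB for signals in [0, peak]."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
    return 10 * math.log10(peak ** 2 / mse)

# An MSE of 0.01 on a unit-range signal gives 20 dB; halving the
# error magnitude everywhere would add 10*log10(4) ~ 6 dB.
print(round(psnr([0.5, 0.5], [0.4, 0.6]), 1))  # 20.0
```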

Journal ArticleDOI
TL;DR: This paper proposes a fully automatic image colorization method for grayscale images using a neural network and optimization, and presents a cost function to formalize the premise that neighboring pixels should have maximum positive similarity of intensities and colors.
Abstract: In this paper, we propose a fully automatic image colorization method for grayscale images using a neural network and optimization. Given a training set of gray images and their corresponding color images, our method segments grayscale images into superpixels and then extracts features at particular points of interest in each superpixel. The obtained features and their RGB values are given as input for training the colorization neural network of each pixel. To achieve a better colorization effect in a shorter running time, our method further propagates the resulting color points to neighboring pixels. In the propagation of color, we present a cost function to formalize the premise that neighboring pixels should have the maximum positive similarity of intensities and colors, and we propose a solution to the resulting optimization problem. Finally, a guided image filter is employed to refine the colorized image. Experiments on a wide variety of images show that the proposed algorithms achieve superior performance over state-of-the-art algorithms.

129 citations
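The cost function described above rewards neighboring pixels with similar intensities for taking similar colors. One common way to realize that premise in optimization-based colorization (the exact form here is an assumption, not the paper's formula) is an exponential intensity-affinity weight used to average nearby seed colors, sketched on a toy 1D signal with a single color channel:

```python
import math

def affinity(y_i, y_j, sigma=0.1):
    """Intensity-similarity weight between neighboring pixels:
    close gray levels -> weight near 1, so color propagates there."""
    return math.exp(-((y_i - y_j) ** 2) / (2 * sigma ** 2))

def propagate(gray, seeds, sigma=0.1):
    """One pass of seed-color propagation on a 1D signal: each unseeded
    pixel takes the affinity-weighted average of its seeded neighbors."""
    out = []
    for i, y in enumerate(gray):
        if i in seeds:
            out.append(seeds[i])
            continue
        num = den = 0.0
        for j in (i - 1, i + 1):          # immediate 1D neighbors
            if j in seeds:
                w = affinity(y, gray[j], sigma)
                num += w * seeds[j]
                den += w
        out.append(num / den if den else 0.0)
    return out

# Pixel 1 is nearly as bright as pixel 0, so it inherits pixel 0's color.
print(propagate([0.2, 0.21, 0.8], {0: 1.0, 2: 0.0}))
```

The real method solves a global optimization over all pixels rather than a single local pass, but the weighting intuition is the same: an edge in intensity blocks color from bleeding across it.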

Proceedings ArticleDOI
01 Jan 2018
TL;DR: This paper reviews the first challenge on spectral image reconstruction from RGB images, i.e., the recovery of whole-scene hyperspectral (HS) information from a 3-channel RGB image.
Abstract: This paper reviews the first challenge on spectral image reconstruction from RGB images, i.e., the recovery of whole-scene hyperspectral (HS) information from a 3-channel RGB image. The challenge was divided into 2 tracks: the "Clean" track sought HS recovery from noiseless RGB images obtained via a known response function (representing a spectrally calibrated camera), while the "Real World" track challenged participants to recover HS cubes from JPEG-compressed RGB images generated by an unknown response function. To facilitate the challenge, the BGU Hyperspectral Image Database [4] was extended to provide participants with 256 natural HS training images, plus 5 and 10 additional images for validation and testing, respectively. The "Clean" and "Real World" tracks had 73 and 63 registered participants, respectively, with 12 teams competing in the final testing phase. The proposed methods and their corresponding results are reported in this review.

128 citations