Proceedings Article

2D-3D CNN Based Architectures for Spectral Reconstruction from RGB Images

TL;DR: This work proposes 2D and 3D convolutional neural network (CNN) based approaches for hyperspectral image reconstruction from RGB images that achieve very good performance in terms of MRAE and RMSE.
Abstract: Hyperspectral cameras preserve fine spectral details of scenes that are not captured by traditional RGB cameras, which coarsely quantize scene radiance into three RGB channels. Spectral details provide additional information that improves the performance of numerous image-based analytic applications, but because of the high cost of hyperspectral hardware and its associated physical constraints, hyperspectral images are not easily available for further processing. Motivated by the performance of deep learning in various computer vision applications, we propose 2D convolutional neural network (2D-CNN) and 3D convolutional neural network (3D-CNN) based approaches for hyperspectral image reconstruction from RGB images. The 2D-CNN model primarily focuses on extracting spectral data by considering only the spatial correlation of the channels in the image, while the 3D-CNN model also exploits the inter-channel correlation to refine the extraction of spectral data. Our 3D-CNN based architecture achieves very good performance in terms of MRAE and RMSE. Our 2D-CNN based architecture achieves comparable performance with much lower computational complexity.
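The abstract reports results in terms of MRAE (mean relative absolute error) and RMSE (root mean squared error) without giving formulas. A minimal sketch of how these two metrics are commonly computed for a reconstructed hyperspectral cube is shown here; the function names, the `eps` guard against division by zero, and the toy data are assumptions for illustration, not the authors' evaluation code.

```python
import numpy as np

def mrae(pred, gt, eps=1e-8):
    """Mean relative absolute error: mean(|pred - gt| / gt).

    eps is an assumed small constant that avoids division by zero
    where the ground-truth value is 0.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.mean(np.abs(pred - gt) / (gt + eps)))

def rmse(pred, gt):
    """Root mean squared error over all pixels and bands."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.sqrt(np.mean((pred - gt) ** 2)))

# Toy example: a 4 x 4 cube with 31 bands, every prediction 10% too high.
gt = np.full((4, 4, 31), 0.5)
pred = gt * 1.1
print("MRAE:", mrae(pred, gt))  # close to 0.1 (a 10% relative error)
print("RMSE:", rmse(pred, gt))  # close to 0.05 (an absolute error of 0.05)
```

MRAE normalizes each error by the ground-truth intensity, so dark pixels weigh as much as bright ones, whereas RMSE is dominated by errors in high-intensity regions.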


Citations
Proceedings Article
01 Oct 2019
TL;DR: The λ-net, which reconstructs hyperspectral images from a single shot measurement, can finish the reconstruction task within sub-seconds instead of hours taken by the most recently proposed DeSCI algorithm, thus speeding up the reconstruction >1000 times.
Abstract: We propose the λ-net, which reconstructs hyperspectral images (e.g., with 24 spectral channels) from a single shot measurement. This task is usually termed snapshot compressive-spectral imaging (SCI), which enjoys low cost, low bandwidth and high-speed sensing rate via capturing the three-dimensional (3D) signal i.e., (x, y, λ), using a 2D snapshot. Though proposed more than a decade ago, the poor quality and low-speed of reconstruction algorithms preclude wide applications of SCI. To address this challenge, in this paper, we develop a dual-stage generative model to reconstruct the desired 3D signal in SCI, dubbed λ-net. Results on both simulation and real datasets demonstrate the significant advantages of λ-net, which leads to >4dB improvement in PSNR for real-mask-in-the-loop simulation data compared to the current state-of-the-art. Furthermore, λ-net can finish the reconstruction task within sub-seconds instead of hours taken by the most recently proposed DeSCI algorithm, thus speeding up the reconstruction >1000 times.
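The abstract describes capturing a 3D signal (x, y, λ) in a single 2D snapshot. As a rough illustration of what such a measurement looks like, the sketch below uses a simplified single-disperser model that is common in the CASSI literature: each spectral band is modulated by a coded mask, shifted by a band-dependent offset, and summed on the sensor. The function name, the one-pixel shift per band, and the random binary mask are assumptions for illustration; the abstract does not specify the optical model used by λ-net.

```python
import numpy as np

def snapshot_measurement(cube, mask, step=1):
    """Simulate a simplified coded-aperture snapshot measurement.

    cube: (H, W, B) hyperspectral cube
    mask: (H, W) coded aperture pattern
    step: assumed dispersion shift, in pixels, between adjacent bands
    Returns a 2D measurement of shape (H, W + step * (B - 1)).
    """
    h, w, b = cube.shape
    y = np.zeros((h, w + step * (b - 1)))
    for k in range(b):
        # Band k is masked, shifted by k * step columns, and accumulated.
        y[:, k * step : k * step + w] += mask * cube[:, :, k]
    return y

rng = np.random.default_rng(0)
cube = rng.random((64, 64, 24))                        # 24 spectral channels
mask = (rng.random((64, 64)) > 0.5).astype(np.float64)  # random binary mask
y = snapshot_measurement(cube, mask)
print(y.shape)  # (64, 87): a single 2D snapshot encoding all 24 bands
```

Reconstruction is the inverse problem: recovering the full `(H, W, B)` cube from the single 2D array `y` and the known mask.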

149 citations


Cites methods from "2D-3D CNN Based Architectures for S..."

  • ...With some modifications, we have compared λ-net with the networks developed in [35, 37, 53] for CASSI reconstruction....

    [...]

Proceedings Article
01 Jan 2018
TL;DR: This paper reviews the first challenge on spectral image reconstruction from RGB images, i.e., the recovery of whole-scene hyperspectral (HS) information from a 3-channel RGB image.
Abstract: This paper reviews the first challenge on spectral image reconstruction from RGB images, i.e., the recovery of whole-scene hyperspectral (HS) information from a 3-channel RGB image. The challenge was divided into 2 tracks: the "Clean" track sought HS recovery from noiseless RGB images obtained from a known response function (representing spectrally-calibrated camera) while the "Real World" track challenged participants to recover HS cubes from JPEG-compressed RGB images generated by an unknown response function. To facilitate the challenge, the BGU Hyperspectral Image Database [4] was extended to provide participants with 256 natural HS training images, and 5+10 additional images for validation and testing, respectively. The "Clean" and "Real World" tracks had 73 and 63 registered participants respectively, with 12 teams competing in the final testing phase. Proposed methods and their corresponding results are reported in this review.

128 citations


Cites background from "2D-3D CNN Based Architectures for S..."

  • ...64s PyTorch Titan X Pascal CEERI [15] 2....

    [...]

  • ...CEERI team Title: RGB to Hyperspectral conversion using deep convolutional neural network Members: Manoj Sharma (sriharsharaja@gmail.com), Sriharsha Koundinya, Avinash Upadhyay, Raunak Manekar, Rudrabha Mukhopadhyay, Himanshu Sharma, Santanu Chaudhury Affiliations: CSIR-CEERI A.9....

    [...]

Book Chapter
23 Aug 2020
TL;DR: This work reproduces a stable single disperser CASSI system and proposes a novel deep convolutional network to carry out the real-time reconstruction by using self-attention, employing Spatial-Spectral Self-Attention (TSA) to process each dimension sequentially, yet in an order-independent manner.
Abstract: Coded aperture snapshot spectral imaging (CASSI) is an effective tool to capture real-world 3D hyperspectral images. While a number of existing works have addressed hardware and algorithm design, we make a step towards a low-cost solution that enjoys video-rate, high-quality reconstruction. To make solid progress on this challenging yet under-investigated task, we reproduce a stable single disperser (SD) CASSI system to gather large-scale real-world CASSI data and propose a novel deep convolutional network to carry out the real-time reconstruction by using self-attention. In order to jointly capture the self-attention across different dimensions in hyperspectral images (i.e., channel-wise spectral correlation and non-local spatial regions), we propose Spatial-Spectral Self-Attention (TSA) to process each dimension sequentially, yet in an order-independent manner. We employ TSA in an encoder-decoder network, dubbed TSA-Net, to reconstruct the desired 3D cube. Furthermore, we investigate how noise affects the results and propose to add shot noise in model training, which improves the real data results significantly. We hope our large-scale CASSI data serve as a benchmark in future research and our TSA model as a baseline in deep learning based reconstruction algorithms. Our code and data are available at https://github.com/mengziyi64/TSA-Net.

96 citations


Cites background from "2D-3D CNN Based Architectures for S..."

  • ...Inspired by the recent advances of deep learning on image restoration [24, 59, 71], researchers have started using deep learning to reconstruct hyperspectral images from RGB images [1, 19, 20, 31, 40]....

    [...]

Proceedings Article
Jiaojiao Li1, Chaoxiong Wu1, Rui Song1, Yunsong Li1, Fei Liu1 
14 Jun 2020
TL;DR: This paper proposes an adaptive weighted attention network (AWAN) for spectral reconstruction, in which an adaptive weighted channel attention (AWCA) module reallocates channel-wise feature responses by integrating correlations between channels, and a patch-level second-order non-local (PSNL) module captures long-range spatial contextual information.
Abstract: Recent promising efforts in spectral reconstruction (SR) focus on learning a complicated mapping by using deeper and wider convolutional neural networks (CNNs). Nevertheless, most CNN-based SR algorithms neglect to explore the camera spectral sensitivity (CSS) prior and the interdependencies among intermediate features, thus limiting the representation ability of the network and the performance of SR. To address these issues, we propose a novel adaptive weighted attention network (AWAN) for SR, whose backbone is stacked with multiple dual residual attention blocks (DRAB) decorated with long and short skip connections to form the dual residual learning. Concretely, we investigate an adaptive weighted channel attention (AWCA) module to reallocate channel-wise feature responses via integrating correlations between channels. Furthermore, a patch-level second-order non-local (PSNL) module is developed to capture long-range spatial contextual information by second-order non-local operations for more powerful feature representations. Based on the fact that the recovered RGB images can be projected by the reconstructed hyperspectral image (HSI) and the given CSS function, we incorporate the discrepancies of the RGB images and HSIs as a finer constraint for more accurate reconstruction. Experimental results demonstrate the effectiveness of our proposed AWAN network in terms of quantitative comparison and perceptual quality over other state-of-the-art SR methods. In the NTIRE 2020 Spectral Reconstruction Challenge, our entries obtain the 1st ranking on the "Clean" track and the 3rd place on the "Real World" track. Codes are available at https://github.com/Deep-imagelab/AWAN.
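The abstract notes that an RGB image can be recovered by projecting the reconstructed HSI through the given CSS function, and that the discrepancy is used as an extra constraint. The sketch below illustrates that idea under stated assumptions: the CSS is represented as a `(bands, 3)` matrix, the projection is a per-pixel matrix product, and the discrepancy is a mean absolute error. The function names, the random CSS, and the choice of an L1-style discrepancy are hypothetical; the abstract does not give the exact form of the AWAN loss.

```python
import numpy as np

def project_to_rgb(hsi, css):
    """Project a hyperspectral cube to RGB.

    hsi: (H, W, B) cube; css: (B, 3) camera spectral sensitivity.
    Each RGB value is a weighted sum of the B band values at that pixel.
    """
    return hsi @ css

def rgb_consistency_error(hsi_pred, rgb_in, css):
    """Mean absolute gap between the input RGB and the re-projected HSI."""
    return float(np.mean(np.abs(project_to_rgb(hsi_pred, css) - rgb_in)))

rng = np.random.default_rng(0)
bands = 31
css = rng.random((bands, 3))
css /= css.sum(axis=0, keepdims=True)  # weights of each channel sum to 1

hsi_true = rng.random((8, 8, bands))
rgb = project_to_rgb(hsi_true, css)    # simulated camera output, (8, 8, 3)

print(rgb_consistency_error(hsi_true, rgb, css))        # 0.0 for the true cube
print(rgb_consistency_error(hsi_true + 0.1, rgb, css))  # about 0.1 when biased
```

Because many different spectra project to the same RGB triple, a zero consistency error does not guarantee a correct reconstruction; it only rules out cubes that contradict the input image.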

81 citations

Proceedings Article
Yuzhi Zhao1, Lai-Man Po1, Qiong Yan2, Wei Liu2, Tingyu Lin1 
14 Jun 2020
TL;DR: A 4-level Hierarchical Regression Network (HRNet) with PixelShuffle layer as inter-level interaction with a residual dense block to remove artifacts of real world RGB images and a residual global block to build attention mechanism for enlarging perceptive field is proposed.
Abstract: Capturing visual image with a hyperspectral camera has been successfully applied to many areas due to its narrow-band imaging technology. Hyperspectral reconstruction from RGB images denotes a reverse process of hyperspectral imaging by discovering an inverse response function. Current works mainly map RGB images directly to corresponding spectrum but do not consider context information explicitly. Moreover, the use of encoder-decoder pair in current algorithms leads to loss of information. To address these problems, we propose a 4-level Hierarchical Regression Network (HRNet) with PixelShuffle layer as inter-level interaction. Furthermore, we adopt a residual dense block to remove artifacts of real world RGB images and a residual global block to build attention mechanism for enlarging perceptive field. We evaluate proposed HRNet with other architectures and techniques by participating in NTIRE 2020 Challenge on Spectral Reconstruction from RGB Images. The HRNet is the winning method of track 2 - real world images and ranks 3rd on track 1 - clean images. Please visit the project web page this https URL to try our codes and pre-trained models.

65 citations


Cites methods from "2D-3D CNN Based Architectures for S..."

  • ...With the improvement of the scale and resolution of natural HS dataset, the training of deep learning method becomes more feasible, a number of algorithms based on convolutional neural network were proposed [21, 33]....

    [...]

  • ...The previous methods [32, 21, 33, 6, 36] mainly utilize an auto-encoder structure with residual blocks [14]....

    [...]

References
Journal Article
TL;DR: This paper addresses the problem of the classification of hyperspectral remote sensing images by support vector machines by understanding and assessing the potentialities of SVM classifiers in hyperdimensional feature spaces and concludes that SVMs are a valid and effective alternative to conventional pattern recognition approaches.
Abstract: This paper addresses the problem of the classification of hyperspectral remote sensing images by support vector machines (SVMs). First, we propose a theoretical discussion and experimental analysis aimed at understanding and assessing the potentialities of SVM classifiers in hyperdimensional feature spaces. Then, we assess the effectiveness of SVMs with respect to conventional feature-reduction-based approaches and their performances in hypersubspaces of various dimensionalities. To sustain such an analysis, the performances of SVMs are compared with those of two other nonparametric classifiers (i.e., radial basis function neural networks and the K-nearest neighbor classifier). Finally, we study the potentially critical issue of applying binary SVMs to multiclass problems in hyperspectral data. In particular, four different multiclass strategies are analyzed and compared: the one-against-all, the one-against-one, and two hierarchical tree-based strategies. Different performance indicators have been used to support our experimental studies in a detailed and accurate way, i.e., the classification accuracy, the computational time, the stability to parameter setting, and the complexity of the multiclass architecture. The results obtained on a real Airborne Visible/Infrared Imaging Spectroradiometer hyperspectral dataset allow to conclude that, whatever the multiclass strategy adopted, SVMs are a valid and effective alternative to conventional pattern recognition approaches (feature-reduction procedures combined with a classification method) for the classification of hyperspectral remote sensing data.

3,607 citations


"2D-3D CNN Based Architectures for S..." refers background in this paper

  • ...Every matter has different spectral characteristics, capturing the differences in these characteristics can be of critical importance in a wide variety of applications like medical imaging [12] [31][36], remote sensing [6][7][9][27][38] and forensics[19]....

    [...]

Journal Article
07 Jun 1985-Science
TL;DR: The initial results show that remote, direct identification of surface materials on a picture-element basis can be accomplished by proper sampling of absorption features in the reflectance spectrum.
Abstract: Imaging spectrometry, a new technique for the remote sensing of the earth, is now technically feasible from aircraft and spacecraft. The initial results show that remote, direct identification of surface materials on a picture-element basis can be accomplished by proper sampling of absorption features in the reflectance spectrum. The airborne and spaceborne sensors are capable of acquiring images simultaneously in 100 to 200 contiguous spectral bands. The ability to acquire laboratory-like spectra remotely is a major advance in remote sensing capability. Concomitant advances in computer technology for the reduction and storage of such potentially massive data sets are at hand, and new analytic techniques are being developed to extract the full information content of the data. The emphasis on the deterministic approach to multispectral data analysis as opposed to the statistical approaches used in the past should stimulate the development of new digital image-processing methodologies.

1,750 citations


"2D-3D CNN Based Architectures for S..." refers background in this paper

  • ...Hyperspectral images are used in remote sensing application for more than three decades [17]....

    [...]

Journal Article
TL;DR: The Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) was the first imaging sensor to measure the solar reflected spectrum from 400 nm to 2500 nm at 10 nm intervals.

1,729 citations


"2D-3D CNN Based Architectures for S..." refers methods in this paper

  • ...NASA’s AVIRIS[18] uses hyperspectral imaging systems which acquire images by using ’whisk broom’ scanning method....

    [...]

Journal Article
TL;DR: This review presents an overview of the literature on medical hyperspectral imaging technology and its applications, serving as an introduction for those new to the field, an overview for those working in it, and a reference for those searching for literature on a specific application.
Abstract: Hyperspectral imaging (HSI) is an emerging imaging modality for medical applications, especially in disease diagnosis and image-guided surgery. HSI acquires a three-dimensional dataset called hypercube, with two spatial dimensions and one spectral dimension. Spatially resolved spectral imaging obtained by HSI provides diagnostic information about the tissue physiology, morphology, and composition. This review paper presents an overview of the literature on medical hyperspectral imaging technology and its applications. The aim of the survey is threefold: an introduction for those new to the field, an overview for those working in the field, and a reference for those searching for literature on a specific application.

1,605 citations


"2D-3D CNN Based Architectures for S..." refers background in this paper

  • ...Further hyperspectral imaging requires high spectral resolution, it also needs more exposure time to create a noiseless hyperspectral image[26]....

    [...]

Journal Article
TL;DR: A comprehensive optimization method to arrive at the spatial and spectral layout of the color filter array of a GAP camera is presented and a novel algorithm for reconstructing the under-sampled channels of the image while minimizing aliasing artifacts is developed.
Abstract: We propose the concept of a generalized assorted pixel (GAP) camera, which enables the user to capture a single image of a scene and, after the fact, control the tradeoff between spatial resolution, dynamic range and spectral detail. The GAP camera uses a complex array (or mosaic) of color filters. A major problem with using such an array is that the captured image is severely under-sampled for at least some of the filter types. This leads to reconstructed images with strong aliasing. We make four contributions in this paper: 1) We present a comprehensive optimization method to arrive at the spatial and spectral layout of the color filter array of a GAP camera. 2) We develop a novel algorithm for reconstructing the under-sampled channels of the image while minimizing aliasing artifacts. 3) We demonstrate how the user can capture a single image and then control the tradeoff of spatial resolution to generate a variety of images, including monochrome, high dynamic range (HDR) monochrome, RGB, HDR RGB, and multispectral images. 4) Finally, the performance of our GAP camera has been verified using extensive simulations that use multispectral images of real world scenes. A large database of these multispectral images has been made available at http://www1.cs.columbia.edu/CAVE/projects/gap_camera/ for use by the research community.

833 citations


"2D-3D CNN Based Architectures for S..." refers methods in this paper

  • ...2D-3D CNN model are tested and trained using NTIRE2018 image dataset[4], CAVE [43] and iCVL dataset [3]....

    [...]

  • ...We evaluated the proposed approach using the hyperspectral images from NTIRE-2018[4], iCVL[3] and CAVE [43] databases....

    [...]

  • ...We have also shown the relative performance of the proposed methods compared to other CNN based algorithms in Table 2 and Table 3 on CAVE and iCVL datasets respectively....

    [...]

  • ...CAVE dataset consists 32 hyperspectral images taken by Apogee Alta U260 camera size of each image is 512 × 512, it also consists of 31 spectral bands with range 400-700nm with 10nm steps....

    [...]

  • ...Individual models have been trained on distinct datasets [4][3][43]....

    [...]
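The excerpts above describe CAVE images as 512 × 512 pixels with 31 spectral bands covering 400-700 nm in 10 nm steps. The short sketch below checks that arithmetic and derives the shape of one data cube; the variable names are chosen for illustration only.

```python
# Band centers from 400 nm to 700 nm inclusive, in 10 nm steps.
wavelengths_nm = list(range(400, 701, 10))
num_bands = len(wavelengths_nm)
print(num_bands)  # 31

# One hyperspectral cube: height x width x bands.
cube_shape = (512, 512, num_bands)
values_per_cube = cube_shape[0] * cube_shape[1] * cube_shape[2]
print(cube_shape, values_per_cube)  # (512, 512, 31) 8126464
```

The corresponding RGB input has only 3 values per pixel, so the reconstruction task must infer 31 values from 3 at every pixel.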