
Showing papers on "Image resolution published in 2018"


Journal ArticleDOI
TL;DR: This work shows how content-aware image restoration based on deep learning extends the range of biological phenomena observable by microscopy by bypassing the trade-offs between imaging speed, resolution, and maximal light exposure that limit fluorescence imaging, thereby enabling discovery.
Abstract: Fluorescence microscopy is a key driver of discoveries in the life sciences, with observable phenomena being limited by the optics of the microscope, the chemistry of the fluorophores, and the maximum photon exposure tolerated by the sample. These limits necessitate trade-offs between imaging speed, spatial resolution, light exposure, and imaging depth. In this work we show how content-aware image restoration based on deep learning extends the range of biological phenomena observable by microscopy. We demonstrate on eight concrete examples how microscopy images can be restored even if 60-fold fewer photons are used during acquisition, how near isotropic resolution can be achieved with up to tenfold under-sampling along the axial direction, and how tubular and granular structures smaller than the diffraction limit can be resolved at 20-times-higher frame rates compared to state-of-the-art methods. All developed image restoration methods are freely available as open source software in Python, FIJI, and KNIME. Content-aware image restoration (CARE) uses deep learning to improve microscopy images. CARE bypasses the trade-offs between imaging speed, resolution, and maximal light exposure that limit fluorescence imaging to enable discovery.

694 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this paper, the authors exploit the internal recurrence of information inside a single image, and train a small image-specific CNN at test time, on examples extracted solely from the input image itself.
Abstract: Deep Learning has led to a dramatic leap in Super-Resolution (SR) performance in the past few years. However, being supervised, these SR methods are restricted to specific training data, where the acquisition of the low-resolution (LR) images from their high-resolution (HR) counterparts is predetermined (e.g., bicubic downscaling), without any distracting artifacts (e.g., sensor noise, image compression, non-ideal PSF, etc). Real LR images, however, rarely obey these restrictions, resulting in poor SR results by SotA (State of the Art) methods. In this paper we introduce "Zero-Shot" SR, which exploits the power of Deep Learning, but does not rely on prior training. We exploit the internal recurrence of information inside a single image, and train a small image-specific CNN at test time, on examples extracted solely from the input image itself. As such, it can adapt itself to different settings per image. This makes it possible to perform SR on real old photos, noisy images, biological data, and other images where the acquisition process is unknown or non-ideal. On such images, our method outperforms SotA CNN-based SR methods, as well as previous unsupervised SR methods. To the best of our knowledge, this is the first unsupervised CNN-based SR method.
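The test-time training loop described above can be sketched in a few lines. A minimal illustration assuming PyTorch and a single grayscale image held as a NumPy array; the network depth, residual formulation, loss, step count, and downscaling kernel are illustrative placeholders, not the paper's exact configuration (which also uses augmentation and, when available, an estimated downscaling kernel).

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F

img = np.random.rand(128, 128).astype(np.float32)   # placeholder for the real input image
hr = torch.from_numpy(img)[None, None]               # treat the input itself as the HR "parent"

# Small image-specific CNN, trained from scratch at test time.
net = nn.Sequential(
    nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 1, 3, padding=1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

# Build the LR "child": downscale the input, then bring it back to the parent grid.
lr_child = F.interpolate(hr, scale_factor=0.5, mode="bilinear", align_corners=False)
lr_up = F.interpolate(lr_child, size=hr.shape[-2:], mode="bilinear", align_corners=False)

# Zero-shot training: learn the residual that maps the degraded copy back to the input.
for step in range(200):
    loss = F.l1_loss(net(lr_up) + lr_up, hr)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Test time: apply the learned mapping to the original image to predict a 2x larger output.
with torch.no_grad():
    inp = F.interpolate(hr, scale_factor=2, mode="bilinear", align_corners=False)
    sr = (net(inp) + inp).squeeze().numpy()
```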

615 citations


Journal ArticleDOI
TL;DR: This work introduces an effective technique to enhance the images captured underwater and degraded due to the medium scattering and absorption by building on the blending of two images that are directly derived from a color-compensated and white-balanced version of the original degraded image.
Abstract: We introduce an effective technique to enhance the images captured underwater and degraded due to the medium scattering and absorption. Our method is a single image approach that does not require specialized hardware or knowledge about the underwater conditions or scene structure. It builds on the blending of two images that are directly derived from a color-compensated and white-balanced version of the original degraded image. The two images to be fused, as well as their associated weight maps, are defined to promote the transfer of edges and color contrast to the output image. To prevent the sharp weight map transitions from creating artifacts in the low frequency components of the reconstructed image, we also adopt a multiscale fusion strategy. Our extensive qualitative and quantitative evaluation reveals that our enhanced images and videos are characterized by better exposedness of the dark regions, improved global contrast, and edge sharpness. Our validation also proves that our algorithm is reasonably independent of the camera settings, and improves the accuracy of several image processing applications, such as image segmentation and keypoint matching.
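A toy NumPy/SciPy illustration of the multiscale fusion step alone, assuming two already-derived inputs of equal power-of-two size and weight maps that sum to one; the paper's actual derived inputs and weight maps (contrast, saliency, saturation/exposedness) are not reproduced here.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def gaussian_pyramid(img, levels):
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(gaussian_filter(pyr[-1], sigma=1.0)[::2, ::2])
    return pyr

def laplacian_pyramid(img, levels):
    gp = gaussian_pyramid(img, levels)
    lp = [gp[i] - zoom(gp[i + 1], 2, order=1) for i in range(levels - 1)]
    lp.append(gp[-1])
    return lp

def multiscale_fusion(inputs, weights, levels=4):
    """Blend inputs with their weight maps level by level, so sharp weight transitions
    do not introduce artifacts in the low-frequency components of the result."""
    fused = None
    for image, weight in zip(inputs, weights):
        lp = laplacian_pyramid(image, levels)
        gw = gaussian_pyramid(weight, levels)               # smoothed weights per level
        contrib = [l * g for l, g in zip(lp, gw)]
        fused = contrib if fused is None else [f + c for f, c in zip(fused, contrib)]
    out = fused[-1]                                          # collapse the fused pyramid
    for level in reversed(fused[:-1]):
        out = zoom(out, 2, order=1) + level
    return out

# Example with two derived images (e.g., sharpened and contrast-enhanced versions).
a, b = np.random.rand(256, 256), np.random.rand(256, 256)
wa = np.random.rand(256, 256)
result = multiscale_fusion([a, b], [wa, 1.0 - wa])
```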

601 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: Zheng et al. propose a deep but compact convolutional network to directly reconstruct the high resolution image from the original low resolution image; it consists of three parts: a feature extraction block, stacked information distillation blocks, and a reconstruction block.
Abstract: Recently, deep convolutional neural networks (CNNs) have demonstrated remarkable progress on single image super-resolution. However, as the depth and width of the networks increase, CNN-based super-resolution methods have been faced with the challenges of computational complexity and memory consumption in practice. To address these problems, we propose a deep but compact convolutional network to directly reconstruct the high resolution image from the original low resolution image. In general, the proposed model consists of three parts: a feature extraction block, stacked information distillation blocks, and a reconstruction block. By combining an enhancement unit with a compression unit into a distillation block, the local long- and short-path features can be effectively extracted. Specifically, the proposed enhancement unit mixes together two different types of features, and the compression unit distills more useful information for the subsequent blocks. In addition, the proposed network has the advantage of fast execution due to the comparatively small number of filters per layer and the use of group convolution. Experimental results demonstrate that the proposed method is superior to the state-of-the-art methods, especially in terms of time performance. Code is available at https://github.com/Zheng222/IDN-Caffe.
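A hedged PyTorch sketch of one "distillation block" in the spirit of the description above: an enhancement unit that mixes newly computed features with a retained slice of its input, followed by a 1x1 compression unit, with group convolution used to keep the filter count low. Channel splits, layer counts, and group sizes here are illustrative and differ from the released IDN code.

```python
import torch
import torch.nn as nn

class DistillationBlock(nn.Module):
    """Simplified information-distillation block (illustrative, not the released IDN layout)."""
    def __init__(self, channels=64, distill=16, groups=4):
        super().__init__()
        self.enhance = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=groups), nn.LeakyReLU(0.05),
            nn.Conv2d(channels, channels - distill, 3, padding=1), nn.LeakyReLU(0.05),
        )
        self.refine = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=groups), nn.LeakyReLU(0.05),
            nn.Conv2d(channels, channels + distill, 3, padding=1), nn.LeakyReLU(0.05),
        )
        self.compress = nn.Conv2d(channels + distill, channels, 1)  # compression unit
        self.distill = distill

    def forward(self, x):
        feat = self.enhance(x)                       # channels - distill new features
        retained = x[:, : self.distill]              # short-path features kept from the input
        mixed = torch.cat([retained, feat], dim=1)   # back to `channels` features
        long_path = self.refine(mixed)               # channels + distill features
        return self.compress(long_path) + x          # distill useful info, keep a residual path

block = DistillationBlock()
out = block(torch.randn(1, 64, 32, 32))              # -> shape (1, 64, 32, 32)
```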

567 citations


Journal ArticleDOI
01 Jul 2018-Nature
TL;DR: This ptychographic reconstruction improves the image contrast of single-atom defects in MoS2 substantially, reaching an information limit close to 5α, which corresponds to an Abbe diffraction-limited resolution of 0.39 ångström.
Abstract: Aberration-corrected optics have made electron microscopy at atomic resolution a widespread and often essential tool for characterizing nanoscale structures. Image resolution has traditionally been improved by increasing the numerical aperture of the lens (α) and the beam energy, with the state-of-the-art at 300 kiloelectronvolts just entering the deep sub-angstrom (that is, less than 0.5 angstrom) regime. Two-dimensional (2D) materials are imaged at lower beam energies to avoid displacement damage from large momenta transfers, limiting spatial resolution to about 1 angstrom. Here, by combining an electron microscope pixel-array detector with the dynamic range necessary to record the complete distribution of transmitted electrons and full-field ptychography to recover phase information from the full phase space, we increase the spatial resolution well beyond the traditional numerical-aperture-limited resolution. At a beam energy of 80 kiloelectronvolts, our ptychographic reconstruction improves the image contrast of single-atom defects in MoS2 substantially, reaching an information limit close to 5α, which corresponds to an Abbe diffraction-limited resolution of 0.39 angstrom, at the electron dose and imaging conditions for which conventional imaging methods reach only 0.98 angstrom.

441 citations


Journal ArticleDOI
TL;DR: Simulations and experimental imaging of microtubules, nuclear pores, and mitochondria show that high-quality, super-resolution images can be reconstructed from up to two orders of magnitude fewer frames than usually needed, without compromising spatial resolution.
Abstract: The speed of super-resolution microscopy methods based on single-molecule localization, for example, PALM and STORM, is limited by the need to record many thousands of frames with a small number of observed molecules in each. Here, we present ANNA-PALM, a computational strategy that uses artificial neural networks to reconstruct super-resolution views from sparse, rapidly acquired localization images and/or widefield images. Simulations and experimental imaging of microtubules, nuclear pores, and mitochondria show that high-quality, super-resolution images can be reconstructed from up to two orders of magnitude fewer frames than usually needed, without compromising spatial resolution. Super-resolution reconstructions are even possible from widefield images alone, though adding localization data improves image quality. We demonstrate super-resolution imaging of >1,000 fields of view containing >1,000 cells in ∼3 h, yielding an image spanning spatial scales from ∼20 nm to ∼2 mm. The drastic reduction in acquisition time and sample irradiation afforded by ANNA-PALM enables faster and gentler high-throughput and live-cell super-resolution imaging.

417 citations


Journal ArticleDOI
TL;DR: In the proposed CSTF method, an HR-HSI is considered as a 3D tensor and the fusion problem is redefined as the estimation of a core tensor and dictionaries of the three modes; experiments demonstrate the superiority of this algorithm over the current state-of-the-art HSI-MSI fusion approaches.
Abstract: Fusing a low spatial resolution hyperspectral image (LR-HSI) with a high spatial resolution multispectral image (HR-MSI) to obtain a high spatial resolution hyperspectral image (HR-HSI) has attracted increasing interest in recent years. In this paper, we propose a coupled sparse tensor factorization (CSTF)-based approach for fusing such images. In the proposed CSTF method, we consider an HR-HSI as a 3D tensor and redefine the fusion problem as the estimation of a core tensor and dictionaries of the three modes. The high spatial-spectral correlations in the HR-HSI are modeled by incorporating a regularizer, which promotes sparse core tensors. The estimation of the dictionaries and the core tensor is formulated as a coupled tensor factorization of the LR-HSI and of the HR-MSI. Experiments on two remotely sensed HSIs demonstrate the superiority of the proposed CSTF algorithm over the current state-of-the-art HSI-MSI fusion approaches.
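In tensor notation, the model treats the HR-HSI as a Tucker-style product of a (sparse) core tensor with three mode dictionaries, Z ≈ C ×1 W ×2 H ×3 S, and couples it to the two observations. A small NumPy illustration of this forward model only; the sparsity regularizer and the coupled optimization are omitted, and the downsampling and spectral-response operators are simplified placeholders.

```python
import numpy as np

# Dimensions (illustrative): HR-HSI of size W x H x L, core tensor of size nw x nh x ns.
W_dim, H_dim, L = 64, 64, 100
nw, nh, ns = 20, 20, 12

C = np.random.rand(nw, nh, ns)        # core tensor (sparse in the paper)
D_w = np.random.rand(W_dim, nw)       # mode-1 (width) dictionary
D_h = np.random.rand(H_dim, nh)       # mode-2 (height) dictionary
D_s = np.random.rand(L, ns)           # mode-3 (spectral) dictionary

# HR-HSI as a 3D tensor: Z = C x1 D_w x2 D_h x3 D_s
Z = np.einsum("abc,ia,jb,kc->ijk", C, D_w, D_h, D_s)

# Observations that couple the factorization:
P_spatial = np.kron(np.eye(W_dim // 4), np.ones((1, 4)) / 4)   # 4x spatial downsampling operator
R_spectral = np.random.rand(8, L)                               # spectral response (L bands -> 8 bands)

lr_hsi = np.einsum("pi,qj,ijk->pqk", P_spatial, P_spatial, Z)   # low spatial resolution, full spectra
hr_msi = np.einsum("mk,ijk->ijm", R_spectral, Z)                # full spatial resolution, few bands
```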

371 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this article, a deep network is trained to predict surface normals and occlusion boundaries, which are then combined with raw depth observations provided by the RGB-D camera to solve for all pixels, including those missing in the original observation.
Abstract: The goal of our work is to complete the depth channel of an RGB-D image. Commodity-grade depth cameras often fail to sense depth for shiny, bright, transparent, and distant surfaces. To address this problem, we train a deep network that takes an RGB image as input and predicts dense surface normals and occlusion boundaries. Those predictions are then combined with raw depth observations provided by the RGB-D camera to solve for depths for all pixels, including those missing in the original observation. This method was chosen over others (e.g., inpainting depths directly) as the result of extensive experiments with a new depth completion benchmark dataset, where holes are filled in training data through the rendering of surface reconstructions created from multiview RGB-D scans. Experiments with different network inputs, depth representations, loss functions, optimization methods, inpainting methods, and deep depth estimation networks show that our proposed approach provides better depth completions than these alternatives.
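A heavily simplified sketch of the final "solve for all pixels" step, assuming SciPy sparse linear algebra: observed depths anchor a data term while a neighbor-smoothness term propagates values into the holes. The actual method additionally weights this optimization by the predicted surface normals and occlusion boundaries, which this toy version leaves out.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def complete_depth(depth, valid, lam=1.0):
    """Fill missing depths by least squares: data term on valid pixels + lam * neighbor smoothness."""
    h, w = depth.shape
    n = h * w
    idx = np.arange(n).reshape(h, w)

    # Data term: keep observed depths.
    obs = np.flatnonzero(valid.ravel())
    data_eq = sp.coo_matrix((np.ones(len(obs)), (obs, obs)), shape=(n, n))
    b = np.zeros(n)
    b[obs] = depth.ravel()[obs]

    eqs, rhs = [data_eq], [b]
    # Smoothness term: penalize differences with right and bottom neighbors.
    for di, dj in [(0, 1), (1, 0)]:
        p = idx[: h - di, : w - dj].ravel()
        q = idx[di:, dj:].ravel()
        m = len(p)
        D = sp.coo_matrix((np.r_[np.full(m, lam), np.full(m, -lam)],
                           (np.r_[np.arange(m), np.arange(m)], np.r_[p, q])), shape=(m, n))
        eqs.append(D)
        rhs.append(np.zeros(m))

    A = sp.vstack(eqs).tocsr()
    x = spsolve((A.T @ A).tocsc(), A.T @ np.concatenate(rhs))
    return x.reshape(h, w)

# Example: a raw depth map whose low-confidence readings are treated as holes.
raw = np.random.rand(60, 80)
valid = raw > 0.3
filled = complete_depth(np.where(valid, raw, 0.0), valid)
```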

353 citations


Journal ArticleDOI
09 Feb 2018-Science
TL;DR: A suite of methods is developed to acquire atomic-resolution TEM images of several metal organic frameworks that are generally recognized as highly sensitive to electron beams, and to identify individual metal atomic columns, various types of surface termination, and benzene rings in the organic linkers.
Abstract: High-resolution imaging of electron beam–sensitive materials is one of the most difficult applications of transmission electron microscopy (TEM). The challenges are manifold, including the acquisition of images with extremely low beam doses, the time-constrained search for crystal zone axes, the precise image alignment, and the accurate determination of the defocus value. We develop a suite of methods to fulfill these requirements and acquire atomic-resolution TEM images of several metal organic frameworks that are generally recognized as highly sensitive to electron beams. The high image resolution allows us to identify individual metal atomic columns, various types of surface termination, and benzene rings in the organic linkers. We also apply our methods to other electron beam–sensitive materials, including the organic-inorganic hybrid perovskite CH3NH3PbBr3.

341 citations


Journal ArticleDOI
TL;DR: In this paper, a multiscale and multidepth CNN is proposed for pan-sharpening of remote sensing images, and the proposed network yields high-resolution MS images that are superior to those produced by the compared state-of-the-art methods.
Abstract: Pan-sharpening is a fundamental and significant task in the field of remote sensing imagery processing, in which high-resolution spatial details from panchromatic images are employed to enhance the spatial resolution of multispectral (MS) images. As the transformation from low spatial resolution MS image to high-resolution MS image is complex and highly nonlinear, inspired by the powerful representation for nonlinear relationships of deep neural networks, we introduce multiscale feature extraction and residual learning into the basic convolutional neural network (CNN) architecture and propose the multiscale and multidepth CNN for the pan-sharpening of remote sensing imagery. Both the quantitative assessment results and the visual assessment confirm that the proposed network yields high-resolution MS images that are superior to the images produced by the compared state-of-the-art methods.

335 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this article, an end-to-end network called Cycle-Dehaze, which does not require pairs of hazy and corresponding ground truth images for training, is presented.
Abstract: In this paper, we present an end-to-end network, called Cycle-Dehaze, for the single image dehazing problem, which does not require pairs of hazy and corresponding ground truth images for training. That is, we train the network by feeding clean and hazy images in an unpaired manner. Moreover, the proposed approach does not rely on estimation of the atmospheric scattering model parameters. Our method enhances the CycleGAN formulation by combining cycle-consistency and perceptual losses in order to improve the quality of textural information recovery and generate visually better haze-free images. Typically, deep learning models for dehazing take low resolution images as input and produce low resolution outputs. However, in the NTIRE 2018 challenge on single image dehazing, high resolution images were provided. Therefore, we apply bicubic downscaling. After obtaining low-resolution outputs from the network, we utilize the Laplacian pyramid to upscale the output images to the original resolution. We conduct experiments on the NYU-Depth, I-HAZE, and O-HAZE datasets. Extensive experiments demonstrate that the proposed approach improves the CycleGAN method both quantitatively and qualitatively.

Proceedings ArticleDOI
18 Jun 2018
TL;DR: This paper reviews the 2nd NTIRE challenge on single image super-resolution (restoration of rich details in a low resolution image) with a focus on proposed solutions and results, and gauges the state-of-the-art in single image super-resolution.
Abstract: This paper reviews the 2nd NTIRE challenge on single image super-resolution (restoration of rich details in a low resolution image) with focus on proposed solutions and results. The challenge had 4 tracks. Track 1 employed the standard bicubic downscaling setup, while Tracks 2, 3 and 4 had realistic unknown downgrading operators simulating the camera image acquisition pipeline. The operators were learnable through provided pairs of low- and high-resolution training images. The tracks had 145, 114, 101, and 113 registered participants, respectively, and 31 teams competed in the final testing phase. The challenge and its results gauge the state-of-the-art in single image super-resolution.

Book ChapterDOI
08 Sep 2018
TL;DR: In this paper, the authors propose an adaptive fusion approach that leverages the complementary properties of both deep and shallow features to improve both robustness and accuracy, which significantly outperforms the top performing tracker from the challenge with a relative gain of 17% in EAO.
Abstract: In the field of generic object tracking numerous attempts have been made to exploit deep features. Despite all expectations, deep trackers are yet to reach an outstanding level of performance compared to methods solely based on handcrafted features. In this paper, we investigate this key issue and propose an approach to unlock the true potential of deep features for tracking. We systematically study the characteristics of both deep and shallow features, and their relation to tracking accuracy and robustness. We identify the limited data and low spatial resolution as the main challenges, and propose strategies to counter these issues when integrating deep features for tracking. Furthermore, we propose a novel adaptive fusion approach that leverages the complementary properties of deep and shallow features to improve both robustness and accuracy. Extensive experiments are performed on four challenging datasets. On VOT2017, our approach significantly outperforms the top performing tracker from the challenge with a relative gain of 17% in EAO.

Posted Content
TL;DR: The main idea is to rely on deep neural networks to represent the contextual information contained in different types of land cover, together with a pseudo-labeling and sample selection scheme for improving the transferability of deep models.
Abstract: In recent years, large amounts of high-spatial-resolution remote sensing (HRRS) images have become available for land-cover mapping. However, due to the complex information brought by the increased spatial resolution and the data disturbances caused by different conditions of image acquisition, it is often difficult to find an efficient method for achieving accurate land-cover classification with high-resolution and heterogeneous remote sensing images. In this paper, we propose a scheme to apply a deep model obtained from a labeled land-cover dataset to classify unlabeled HRRS images. The main idea is to rely on deep neural networks to represent the contextual information contained in different types of land cover and to introduce a pseudo-labeling and sample selection scheme for improving the transferability of deep models. More precisely, a deep convolutional neural network (CNN) is first pre-trained with a well-annotated land-cover dataset, referred to as the source data. Then, given a target image with no labels, the pre-trained CNN model is utilized to classify the image in a patch-wise manner. The patches with high confidence are assigned pseudo-labels and employed as the queries to retrieve related samples from the source data. The pseudo-labels confirmed with the retrieved results are regarded as supervised information for fine-tuning the pre-trained deep model. To obtain a pixel-wise land-cover classification with the target image, we rely on the fine-tuned CNN and develop a hybrid classification by combining patch-wise classification and hierarchical segmentation. In addition, we create a large-scale land-cover dataset containing 150 Gaofen-2 satellite images for CNN pre-training. Experiments on multi-source HRRS images show encouraging results and demonstrate the applicability of the proposed scheme to land-cover classification.
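A hedged PyTorch sketch of the confidence-based pseudo-labeling and fine-tuning loop described above, assuming a CNN already pre-trained on the labeled source data; the retrieval of related source samples and the hierarchical-segmentation stage are not shown, and the threshold and optimizer settings are illustrative.

```python
import torch
import torch.nn.functional as F

def pseudo_label_patches(model, patches, threshold=0.95):
    """Classify target-image patches and keep only high-confidence predictions as pseudo-labels."""
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(patches), dim=1)        # (N, num_classes)
        conf, labels = probs.max(dim=1)
    keep = conf >= threshold
    return patches[keep], labels[keep]

def fine_tune(model, patches, labels, epochs=3, lr=1e-4):
    """Fine-tune the pre-trained model on the pseudo-labeled target patches (single-batch sketch)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        opt.zero_grad()
        loss = F.cross_entropy(model(patches), labels)
        loss.backward()
        opt.step()

# Hypothetical usage: target_patches is a batch of unlabeled patches cut from the HRRS image.
# pseudo_x, pseudo_y = pseudo_label_patches(pretrained_cnn, target_patches)
# fine_tune(pretrained_cnn, pseudo_x, pseudo_y)
```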

Proceedings ArticleDOI
18 Jun 2018
TL;DR: Super-FAN as discussed by the authors integrates a sub-network for face alignment through heatmap regression and optimizes a novel heatmap loss to improve the quality of low-resolution facial images and accurately locate the facial landmarks.
Abstract: This paper addresses 2 challenging tasks: improving the quality of low resolution facial images and accurately locating the facial landmarks on such poor resolution images. To this end, we make the following 5 contributions: (a) we propose Super-FAN: the very first end-to-end system that addresses both tasks simultaneously, i.e. both improves face resolution and detects the facial landmarks. The novelty of Super-FAN lies in incorporating structural information in a GAN-based super-resolution algorithm via integrating a sub-network for face alignment through heatmap regression and optimizing a novel heatmap loss. (b) We illustrate the benefit of training the two networks jointly by reporting good results not only on frontal images (as in prior work) but on the whole spectrum of facial poses, and not only on synthetic low resolution images (as in prior work) but also on real-world images. (c) We improve upon the state-of-the-art in face super-resolution by proposing a new residual-based architecture. (d) Quantitatively, we show large improvement over the state-of-the-art for both face super-resolution and alignment. (e) Qualitatively, we show for the first time good results on real-world low resolution images like the ones of Fig. 1.
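A hedged PyTorch sketch of the joint objective in contribution (a): a pixel reconstruction loss plus a heatmap loss that compares landmark heatmaps predicted on the super-resolved face against heatmaps computed on the HR ground truth. The adversarial term, the FAN architecture, and the actual loss weights are not reproduced; sr_net and fan are assumed, hypothetical modules.

```python
import torch
import torch.nn.functional as F

def superfan_loss(sr_net, fan, lr_face, hr_face, w_pixel=1.0, w_heatmap=1.0):
    """Joint face super-resolution + alignment objective (adversarial term omitted)."""
    sr_face = sr_net(lr_face)

    pixel_loss = F.mse_loss(sr_face, hr_face)

    # Heatmap loss: the face-alignment network should produce the same landmark
    # heatmaps on the super-resolved face as on the real HR face.
    with torch.no_grad():
        target_heatmaps = fan(hr_face)
    pred_heatmaps = fan(sr_face)
    heatmap_loss = F.mse_loss(pred_heatmaps, target_heatmaps)

    return w_pixel * pixel_loss + w_heatmap * heatmap_loss
```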

Posted ContentDOI
23 Jan 2018-bioRxiv
TL;DR: This work shows how deep learning enables biological observations beyond the physical limitations of microscopes, and illustrates how microscopy images can be restored even if 60-fold fewer photons are used during acquisition.
Abstract: Fluorescence microscopy is a key driver of discoveries in the life-sciences, with observable phenomena being limited by the optics of the microscope, the chemistry of the fluorophores, and the maximum photon exposure tolerated by the sample. These limits necessitate trade-offs between imaging speed, spatial resolution, light exposure, and imaging depth. In this work we show how deep learning enables biological observations beyond the physical limitations of microscopes. On seven concrete examples we illustrate how microscopy images can be restored even if 60-fold fewer photons are used during acquisition, how isotropic resolution can be achieved even with a 10-fold under-sampling along the axial direction, and how diffraction-limited structures can be resolved at 20-times higher frame-rates compared to state-of-the-art methods. All developed image restoration methods are freely available as open source software.

Journal ArticleDOI
TL;DR: Based on the deep convolutional neural network, a remote sensing image fusion method that can adequately extract spectral and spatial features from source images is proposed and provides better results compared with other classical methods.
Abstract: Remote sensing images with different spatial and spectral resolution, such as panchromatic (PAN) images and multispectral (MS) images, can be captured by many earth-observing satellites. Normally, PAN images possess high spatial resolution but low spectral resolution, while MS images have high spectral resolution with low spatial resolution. In order to integrate spatial and spectral information contained in the PAN and MS images, image fusion techniques are commonly adopted to generate remote sensing images at both high spatial and spectral resolution. In this study, based on the deep convolutional neural network, a remote sensing image fusion method that can adequately extract spectral and spatial features from source images is proposed. The major innovation of this study is that the proposed fusion method contains a two-branch network with a deeper structure that can capture salient features of the MS and PAN images separately. In addition, residual learning is adopted in our network to model the relationship between the high- and low-resolution MS images. The proposed method mainly consists of two procedures. First, spectral and spatial features are extracted from the MS and PAN images, respectively, by convolutional layers of different depths. Second, the feature fusion procedure utilizes the extracted features from the former step to yield fused images. Evaluations on the QuickBird and Gaofen-1 images show that our proposed method provides better results compared with other classical methods.
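A minimal PyTorch sketch of the two-branch idea: spectral features from the upsampled MS image and spatial features from the PAN image are extracted by branches of different depths, concatenated, and fused, with a residual connection to the upsampled MS image standing in for the residual learning. Layer counts, channel widths, and which branch is deeper are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TwoBranchFusion(nn.Module):
    def __init__(self, ms_bands=4):
        super().__init__()
        # Shallower branch for the (upsampled) multispectral image -> spectral features.
        self.ms_branch = nn.Sequential(
            nn.Conv2d(ms_bands, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # Deeper branch for the panchromatic image -> spatial detail features.
        self.pan_branch = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # Fusion layers map the concatenated features to a high-resolution MS residual.
        self.fuse = nn.Sequential(
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, ms_bands, 3, padding=1),
        )

    def forward(self, ms_up, pan):
        feats = torch.cat([self.ms_branch(ms_up), self.pan_branch(pan)], dim=1)
        return ms_up + self.fuse(feats)   # residual learning w.r.t. the upsampled MS image

net = TwoBranchFusion()
fused = net(torch.randn(1, 4, 128, 128), torch.randn(1, 1, 128, 128))
```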

Journal ArticleDOI
TL;DR: The proposed CNNs-based spatiotemporal fusion method has the following advantages: automatically extracting effective image features; learning an end-to-end mapping between MODIS and LSR Landsat images; and generating more favorable fusion results.
Abstract: We propose a novel spatiotemporal fusion method based on deep convolutional neural networks (CNNs) in the context of massive remote sensing data. In the training stage, we build two five-layer CNNs to deal with the problems of complicated correspondence and large spatial resolution gaps between MODIS and Landsat images. Specifically, we first learn a nonlinear mapping CNN between MODIS and low-spatial-resolution (LSR) Landsat images and then learn a super-resolution CNN between LSR Landsat and original Landsat images. In the prediction stage, instead of directly taking the outputs of CNNs as the fusion result, we design a fusion model consisting of high-pass modulation and a weighting strategy to make full use of the information in prior images. Specifically, we first map the input MODIS images to transitional images via the learned nonlinear mapping CNN and further improve the transitional images to LSR Landsat images via the fusion model; then, via the learned SR CNN, the LSR Landsat images are super-resolved to transitional images, which are further improved to Landsat images via the fusion model. Compared with previous learning-based fusion methods, mainly the sparse-representation-based methods, our CNN-based spatiotemporal method has the following advantages: 1) automatically extracting effective image features; 2) learning an end-to-end mapping between MODIS and LSR Landsat images; and 3) generating more favorable fusion results. To examine the performance of the proposed fusion method, we conduct experiments on two representative Landsat–MODIS datasets by comparing with the sparse-representation-based spatiotemporal fusion model. The quantitative evaluations on all possible prediction dates and the comparison of fusion results on one key date, in both visual effect and quantitative evaluations, demonstrate that the proposed method can generate more accurate fusion results.

Journal ArticleDOI
TL;DR: Thorough evaluation on the LFW and SCface databases shows that the proposed DCR model achieves consistently and considerably better performance than the state of the art.
Abstract: Face images captured by surveillance cameras are often of low resolution (LR), which adversely affects the performance of their matching with high-resolution (HR) gallery images. Existing methods including super resolution, coupled mappings (CMs), multidimensional scaling, and convolutional neural network yield only modest performance. In this letter, we propose the deep coupled ResNet (DCR) model. It consists of one trunk network and two branch networks. The trunk network, trained by face images of three significantly different resolutions, is used to extract discriminative features robust to the resolution change. Two branch networks, trained by HR images and images of the targeted LR, work as resolution-specific CMs to transform HR and corresponding LR features to a space where their difference is minimized. Model parameters of branch networks are optimized using our proposed CM loss function, which considers not only the discriminability of HR and LR features, but also the similarity between them. In order to deal with various possible resolutions of probe images, we train multiple pairs of small branch networks while using the same trunk network. Thorough evaluation on the LFW and SCface databases shows that the proposed DCR model achieves consistently and considerably better performance than the state of the art.

Journal ArticleDOI
TL;DR: In this paper, a new atmospheric correction (AC) method for aquatic applications of MR optical satellite imagery is presented and demonstrated using images from the Pleiades constellation, which are used to detect the turbid wake associated with the MOW1 measurement station.

Proceedings ArticleDOI
Changha Shin, Hae-Gon Jeon, Youngjin Yoon, In So Kweon, Seon Joo Kim
06 Apr 2018
TL;DR: The authors propose a fast and accurate light field depth estimation method based on a fully convolutional neural network, achieve the top rank in the HCI 4D Light Field Benchmark on most metrics, and also demonstrate the effectiveness of the proposed method on real-world light field images.
Abstract: Light field cameras capture both the spatial and the angular properties of light rays in space. Due to its property, one can compute the depth from light fields in uncontrolled lighting environments, which is a big advantage over active sensing devices. Depth computed from light fields can be used for many applications including 3D modelling and refocusing. However, light field images from hand-held cameras have very narrow baselines with noise, making the depth estimation difficult. Many approaches have been proposed to overcome these limitations for the light field depth estimation, but there is a clear trade-off between the accuracy and the speed in these methods. In this paper, we introduce a fast and accurate light field depth estimation method based on a fully-convolutional neural network. Our network is designed by considering the light field geometry and we also overcome the lack of training data by proposing light field specific data augmentation methods. We achieved the top rank in the HCI 4D Light Field Benchmark on most metrics, and we also demonstrate the effectiveness of the proposed method on real-world light-field images.

Journal ArticleDOI
TL;DR: This work develops and implements a novel approach to solving the inverse problem for single-pixel cameras efficiently and represents a significant step towards real-time operation of computational imagers.
Abstract: Single-pixel cameras capture images without the requirement for a multi-pixel sensor, enabling the use of state-of-the-art detector technologies and providing a potentially low-cost solution for sensing beyond the visible spectrum. One limitation of single-pixel cameras is the inherent trade-off between image resolution and frame rate, with current compressive (compressed) sensing techniques being unable to support real-time video. In this work we demonstrate the application of deep learning with convolutional auto-encoder networks to recover real-time 128 × 128 pixel video at 30 frames-per-second from a single-pixel camera sampling at a compression ratio of 2%. In addition, by training the network on a large database of images we are able to optimise the first layer of the convolutional network, equivalent to optimising the basis used for scanning the image intensities. This work develops and implements a novel approach to solving the inverse problem for single-pixel cameras efficiently and represents a significant step towards real-time operation of computational imagers. By learning from examples in a particular context, our approach opens up the possibility of high resolution for task-specific adaptation, with importance for applications in gas sensing, 3D imaging and metrology.
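A hedged sketch of the idea that the first learned layer can play the role of the scanning basis, assuming PyTorch: a linear measurement layer compresses the image to roughly 2% of its pixels and a decoder reconstructs the frame. The paper's convolutional auto-encoder and training setup are not reproduced; this is a fully connected stand-in.

```python
import torch
import torch.nn as nn

H = W = 128
n_pixels = H * W
n_meas = int(0.02 * n_pixels)          # ~2% compression ratio, as reported above

class SinglePixelNet(nn.Module):
    def __init__(self):
        super().__init__()
        # The weights of this layer act as the measurement patterns displayed on the modulator.
        self.measure = nn.Linear(n_pixels, n_meas, bias=False)
        self.decode = nn.Sequential(
            nn.Linear(n_meas, 1024), nn.ReLU(),
            nn.Linear(1024, n_pixels),
        )

    def forward(self, img):
        y = self.measure(img.flatten(1))          # simulated single-pixel measurements
        return self.decode(y).view(-1, 1, H, W)   # reconstructed frame

net = SinglePixelNet()
recon = net(torch.randn(4, 1, H, W))
# After training on an image database, the rows of net.measure.weight can be deployed as the
# scanning patterns, and net.decode alone runs on the measured intensities at video rate.
```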

Journal ArticleDOI
TL;DR: In this paper, a CubeSat Enabled Spatio-temporal Enhancement Method (CESTEM) is proposed to correct for radiometric inconsistencies between CubeSat acquisitions; it can produce Landsat 8-consistent, atmospherically corrected surface reflectances in the blue, green, red, and NIR bands.

Proceedings ArticleDOI
13 Apr 2018
TL;DR: This paper makes the first attempt to solve the HSI-SR problem using an unsupervised encoder-decoder architecture with the following unique properties: it is composed of two encoder-decoder networks, coupled through a shared decoder, in order to preserve the rich spectral information from the HSI network.
Abstract: In many computer vision applications, obtaining images of high resolution in both the spatial and spectral domains is equally important. However, due to hardware limitations, one can only expect to acquire images of high resolution in either the spatial or spectral domains. This paper focuses on hyperspectral image super-resolution (HSI-SR), where a hyperspectral image (HSI) with low spatial resolution (LR) but high spectral resolution is fused with a multispectral image (MSI) with high spatial resolution (HR) but low spectral resolution to obtain HR HSI. Existing deep learning-based solutions are all supervised and would need a large training set and the availability of HR HSI, which is unrealistic. Here, we make the first attempt to solve the HSI-SR problem using an unsupervised encoder-decoder architecture with the following unique properties. First, it is composed of two encoder-decoder networks, coupled through a shared decoder, in order to preserve the rich spectral information from the HSI network. Second, the network encourages the representations from both modalities to follow a sparse Dirichlet distribution, which naturally incorporates the two physical constraints of HSI and MSI. Third, the angular difference between representations is minimized in order to reduce the spectral distortion. We refer to the proposed architecture as unsupervised Sparse Dirichlet-Net, or uSDN. Extensive experimental results demonstrate the superior performance of uSDN as compared to the state-of-the-art.
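A small PyTorch snippet for the third ingredient, minimizing the angular difference between the representations produced by the two branches; the Dirichlet-constrained encoders and the shared decoder are not shown, and the tensor shapes are assumptions.

```python
import torch

def angular_loss(rep_hsi, rep_msi, eps=1e-8):
    """Mean angle between corresponding representation vectors (smaller = less spectral distortion)."""
    cos = torch.nn.functional.cosine_similarity(rep_hsi, rep_msi, dim=-1)
    return torch.acos(cos.clamp(-1 + eps, 1 - eps)).mean()

# rep_hsi, rep_msi: (num_pixels, num_components) abundance-like representations from the two branches.
loss = angular_loss(torch.rand(1024, 10), torch.rand(1024, 10))
```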

Journal ArticleDOI
TL;DR: Experimental results on synthetic and real-world data sets demonstrate that the proposed method outperforms other state-of-the-art methods by a large margin in peak signal-to-noise ratio and gray-scale structural similarity indexes, which also achieves superior quality for human visual systems.
Abstract: The low spatial resolution of light-field images poses significant difficulties in exploiting their advantages. To mitigate the dependency on accurate depth or disparity information as priors for light-field image super-resolution, we propose an implicitly multi-scale fusion scheme to accumulate contextual information from multiple scales for super-resolution reconstruction. The implicitly multi-scale fusion scheme is then incorporated into a bidirectional recurrent convolutional neural network, which aims to iteratively model spatial relations between horizontally or vertically adjacent sub-aperture images of light-field data. Within the network, the recurrent convolutions are modified to be more effective and flexible in modeling the spatial correlations between neighboring views. A horizontal sub-network and a vertical sub-network of the same network structure are ensembled for final outputs via stacked generalization. Experimental results on synthetic and real-world data sets demonstrate that the proposed method outperforms other state-of-the-art methods by a large margin in peak signal-to-noise ratio and gray-scale structural similarity indexes, and also achieves superior quality for the human visual system. Furthermore, the proposed method can enhance the performance of light field applications such as depth estimation.

Proceedings ArticleDOI
18 Jun 2018
TL;DR: This paper addresses the problem of enhancing the resolution of a single low-resolution image by adapting a progressive learning scheme to the deep convolutional neural network and shows that this property yields a large performance gain compared to the non-progressive learning methods.
Abstract: The problem of enhancing the resolution of a single low-resolution image has been popularly addressed by recent deep learning techniques. However, many deep learning approaches still fail to deal with extreme super-resolution scenarios because of the instability of training. In this paper, we address this issue by adapting a progressive learning scheme to the deep convolutional neural network. In detail, the overall training proceeds in multiple stages so that the model gradually increases the output image resolution. In our experiments, we show that this property yields a large performance gain compared to the non-progressive learning methods.

Journal ArticleDOI
TL;DR: It is suggested that SRCNN may become a potential solution for generating high-resolution CT images from standard CT images and significantly outperforms the linear interpolation methods for enhancing image resolution in chest CT images.
Abstract: In this study, the super-resolution convolutional neural network (SRCNN) scheme, which is the emerging deep-learning-based super-resolution method for enhancing image resolution in chest CT images, was applied and evaluated using the post-processing approach. For evaluation, 89 chest CT cases were sampled from The Cancer Imaging Archive. The 89 CT cases were divided randomly into 45 training cases and 44 external test cases. The SRCNN was trained using the training dataset. With the trained SRCNN, a high-resolution image was reconstructed from a low-resolution image, which was down-sampled from an original test image. For quantitative evaluation, two image quality metrics were measured and compared to those of the conventional linear interpolation methods. The image restoration quality of the SRCNN scheme was significantly higher than that of the linear interpolation methods (p < 0.001 or p < 0.05). The high-resolution image reconstructed by the SRCNN scheme was highly restored and comparable to the original reference image, in particular, for a ×2 magnification. These results indicate that the SRCNN scheme significantly outperforms the linear interpolation methods for enhancing image resolution in chest CT images. The results also suggest that SRCNN may become a potential solution for generating high-resolution CT images from standard CT images.
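For reference, the SRCNN used here is a three-layer CNN (patch extraction, nonlinear mapping, reconstruction) applied to a slice that has first been upscaled by interpolation. A standard PyTorch rendering with the commonly used 9-1-5 kernel sizes; the CT study's exact hyperparameters may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SRCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.feature = nn.Conv2d(1, 64, kernel_size=9, padding=4)      # patch extraction
        self.mapping = nn.Conv2d(64, 32, kernel_size=1)                # nonlinear mapping
        self.reconstruct = nn.Conv2d(32, 1, kernel_size=5, padding=2)  # reconstruction

    def forward(self, x):
        x = F.relu(self.feature(x))
        x = F.relu(self.mapping(x))
        return self.reconstruct(x)

# Usage: first upsample the low-resolution slice to the target grid, then restore it.
lr_slice = torch.randn(1, 1, 256, 256)
upscaled = F.interpolate(lr_slice, scale_factor=2, mode="bicubic", align_corners=False)
restored = SRCNN()(upscaled)
```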

Proceedings ArticleDOI
01 Oct 2018
TL;DR: In this article, a pyramid of features extracted from a single input image is used for unsupervised monocular depth estimation on a CPU, even of an embedded system, using a pyramid-based CNN.
Abstract: Unsupervised depth estimation from a single image is a very attractive technique with several implications in robotics, autonomous navigation, augmented reality and so on. This topic represents a very challenging task, and the advent of deep learning has made it possible to tackle this problem with excellent results. However, these architectures are extremely deep and complex. Thus, real-time performance can be achieved only by leveraging power-hungry GPUs, which do not allow depth maps to be inferred in application fields characterized by low-power constraints. To tackle this issue, in this paper we propose a novel architecture capable of quickly inferring an accurate depth map on a CPU, even that of an embedded system, using a pyramid of features extracted from a single input image. Similarly to the state of the art, we train our network in an unsupervised manner, casting depth estimation as an image reconstruction problem. Extensive experimental results on the KITTI dataset show that, compared to the top performing approach, our network has similar accuracy but a much lower complexity (about 6% of the parameters), enabling it to infer a depth map for a KITTI image in about 1.7 s on the Raspberry Pi 3 and at more than 8 Hz on a standard CPU. Moreover, by trading accuracy for efficiency, our network can infer maps at about 2 Hz and 40 Hz, respectively, still being more accurate than most state-of-the-art slower methods. To the best of our knowledge, it is the first method enabling such performance on CPUs, paving the way for effective deployment of unsupervised monocular depth estimation even on embedded systems.

Journal ArticleDOI
20 Mar 2018
TL;DR: A compressive single-pixel imaging approach that can simultaneously encode and recover spatial, spectral, and 3D information of the object in the Fourier space and detect the light signals using a single- pixel detector is reported.
Abstract: Single-pixel imaging can capture images using a detector without spatial resolution, which enables imaging in various situations that are challenging or impossible with conventional pixelated detectors. Here we report a compressive single-pixel imaging approach that can simultaneously encode and recover spatial, spectral, and 3D information of the object. In this approach, we modulate and condense the object information in the Fourier space and detect the light signals using a single-pixel detector. The data-compressing operation is similar to conventional compression algorithms that selectively store the largest coefficients of a transform domain. In our implementation, we selectively sample the largest Fourier coefficients, and no iterative optimization process is needed in the recovery process. We demonstrate an 88% compression ratio for producing a high-quality full-color 3D image. The reported approach provides a solution for information multiplexing in single-pixel imaging settings. It may also generate new insights for developing multi-modality computational imaging systems.
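A toy NumPy illustration of the data-compressing principle: keep only the largest-magnitude Fourier coefficients and reconstruct with a single inverse transform, with no iterative optimization. In the paper these coefficients are acquired physically through single-pixel measurements; here the FFT of a known image stands in for that acquisition, and the 12% keep ratio loosely mirrors the reported 88% compression.

```python
import numpy as np

def compress_fourier(img, keep_ratio=0.12):
    """Selectively sample the largest Fourier coefficients and reconstruct directly."""
    F_img = np.fft.fft2(img)
    mags = np.abs(F_img).ravel()
    k = max(1, int(keep_ratio * mags.size))
    threshold = np.partition(mags, -k)[-k]       # magnitude of the k-th largest coefficient
    mask = np.abs(F_img) >= threshold            # keep only the largest coefficients
    recon = np.real(np.fft.ifft2(F_img * mask))  # single inverse transform, no iterations
    return recon, mask.mean()

img = np.random.rand(256, 256)                   # placeholder for a real scene
recon, sampled_fraction = compress_fourier(img)
print(f"sampled {sampled_fraction:.1%} of the Fourier coefficients")
```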

Journal ArticleDOI
TL;DR: The estimation of the injection coefficients at full resolution for regression-based pansharpening approaches is proposed; an iterative algorithm is introduced, its convergence is demonstrated, and the asymptotic value it reaches is calculated analytically.
Abstract: Pansharpening is usually related to the fusion of a high spatial resolution but low spectral resolution (panchromatic) image with a high spectral resolution but low spatial resolution (multispectral) image. The calculation of injection coefficients through regression is a very popular and powerful approach. These coefficients are usually estimated at reduced resolution. In this paper, the estimation of the injection coefficients at full resolution for regression-based pansharpening approaches is proposed. To this aim, an iterative algorithm is proposed and studied. Its convergence, whatever the initial guess, is demonstrated in all practical cases, and the asymptotic value reached is calculated analytically. The performance is assessed both at reduced resolution and at full resolution on four data sets acquired by the IKONOS sensor and the WorldView-3 sensor. The proposed full-scale approach always shows the best performance with respect to the benchmark consisting of state-of-the-art pansharpening methods.
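For context, a hedged NumPy sketch of a generic regression-based injection step: a per-band gain is estimated by least-squares regression of each upsampled MS band against a low-pass version of the PAN image, and the PAN detail is injected with that gain. The paper's actual contribution, estimating these coefficients at full resolution via the iterative algorithm, is not reproduced here, and the Gaussian low-pass is a crude stand-in for an MTF-matched filter.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def regression_injection(ms_up, pan, sigma=2.0):
    """ms_up: (bands, H, W) multispectral image upsampled to the PAN grid; pan: (H, W)."""
    pan_low = gaussian_filter(pan, sigma)            # low-pass PAN (placeholder for an MTF filter)
    detail = pan - pan_low                           # high-frequency detail to be injected
    fused = np.empty_like(ms_up)
    for b in range(ms_up.shape[0]):
        x = pan_low.ravel() - pan_low.mean()
        y = ms_up[b].ravel() - ms_up[b].mean()
        g = (x @ y) / (x @ x)                        # least-squares injection gain for band b
        fused[b] = ms_up[b] + g * detail
    return fused

ms_up = np.random.rand(4, 256, 256)
pan = np.random.rand(256, 256)
sharpened = regression_injection(ms_up, pan)
```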