Author
Xun Chen
Other affiliations: University of British Columbia, Hefei University of Technology
Bio: Xun Chen is an academic researcher from the University of Science and Technology of China. The author has contributed to research topics including computer science and artificial intelligence, has an h-index of 27, and has co-authored 143 publications receiving 3,549 citations. Previous affiliations of Xun Chen include the University of British Columbia and Hefei University of Technology.
Papers
TL;DR: A new multi-focus image fusion method is proposed that learns a direct mapping between source images and a focus map, using a deep convolutional neural network trained on high-quality image patches and their blurred versions to encode the mapping.
Abstract: Highlights: introduces convolutional neural networks (CNNs) into the field of image fusion; discusses the feasibility and superiority of CNNs for image fusion; proposes a state-of-the-art CNN-based multi-focus image fusion method; exhibits the potential of CNNs for other types of image fusion; and puts forward suggestions for future study of CNN-based image fusion. As is well known, activity level measurement and the fusion rule are two crucial factors in image fusion. In most existing fusion methods, whether in the spatial domain or in a transform domain such as wavelet, the activity level measurement is essentially implemented by designing local filters to extract high-frequency details, and the calculated clarity information of the different source images is then compared using elaborately designed rules to obtain a clarity/focus map. The focus map thus contains integrated clarity information, which is of great significance to various image fusion problems, such as multi-focus image fusion and multi-modal image fusion. However, it is usually difficult to accomplish these two tasks well enough to achieve satisfactory fusion performance. In this study, we address this problem with a deep learning approach, aiming to learn a direct mapping between source images and the focus map. To this end, a deep convolutional neural network (CNN) trained on high-quality image patches and their blurred versions is adopted to encode the mapping. The main novelty of this idea is that the activity level measurement and the fusion rule can be jointly generated by learning a CNN model, which overcomes the difficulty faced by existing fusion methods. Based on this idea, a new multi-focus image fusion method is proposed in this paper. Experimental results demonstrate that the proposed method achieves state-of-the-art fusion performance in terms of both visual quality and objective assessment. The computational speed of the proposed method using parallel computing is fast enough for practical use. The potential of the learned CNN model for other types of image fusion is also briefly exhibited in the experiments.
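The training-data construction described above, clear patches paired with their artificially blurred versions, is simple to reproduce. The PyTorch sketch below shows one minimal way to build such pairs and a toy network that scores which of two co-located patches is in focus; the architecture, patch handling, and blur kernel size are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF

class FocusNet(nn.Module):
    """Tiny CNN scoring a pair of co-located grayscale patches."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(32, 1)  # output > 0 means the first patch is sharper

    def forward(self, a, b):
        x = torch.cat([a, b], dim=1)  # stack the two patches channel-wise
        return self.head(self.features(x).flatten(1))

def make_training_pair(patch):
    """A clear patch and its Gaussian-blurred copy; label: first input is focused."""
    return patch, TF.gaussian_blur(patch, kernel_size=7)

# Usage sketch: net = FocusNet(); score = net(patch_a, patch_b)
# where patch_a, patch_b have shape (N, 1, H, W).
```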
826 citations
TL;DR: A recently emerged signal decomposition model known as convolutional sparse representation (CSR) is introduced into image fusion, motivated by the observation that the CSR model can effectively overcome two drawbacks of patch-based sparse representation: limited detail preservation and high sensitivity to misregistration.
Abstract: As a popular signal modeling technique, sparse representation (SR) has achieved great success in image fusion over the last few years, with a number of effective algorithms being proposed. However, due to the patch-based manner of sparse coding, most existing SR-based fusion methods suffer from two drawbacks, namely limited ability in detail preservation and high sensitivity to misregistration, both of which are of great concern in image fusion. In this letter, we introduce a recently emerged signal decomposition model known as convolutional sparse representation (CSR) into image fusion to address these problems, motivated by the observation that the CSR model can effectively overcome both drawbacks. We propose a CSR-based image fusion framework, in which each source image is decomposed into a base layer and a detail layer, for multi-focus image fusion and multimodal image fusion. Experimental results demonstrate that the proposed fusion methods clearly outperform the SR-based methods in terms of both objective assessment and visual quality.
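To make the two-layer framework concrete, here is a heavily simplified Python sketch: a Gaussian low-pass stands in for the base-layer extraction and a pixelwise max-absolute rule stands in for the CSR coefficient-magnitude comparison, so it illustrates the decompose-and-fuse structure rather than the actual CSR optimization.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def fuse_two_layer(sources, sigma=5.0):
    """sources: list of float images in [0, 1] with identical shapes."""
    bases = [gaussian_filter(s, sigma) for s in sources]   # base layers (low-pass)
    details = [s - b for s, b in zip(sources, bases)]      # detail layers (residual)
    fused_base = np.mean(bases, axis=0)                    # simple averaging rule
    stack = np.stack(details)
    idx = np.argmax(np.abs(stack), axis=0)                 # activity = |detail|
    fused_detail = np.take_along_axis(stack, idx[None], axis=0)[0]
    return np.clip(fused_base + fused_detail, 0.0, 1.0)
```

In the actual method, the detail layers would be coded with a convolutional dictionary and the fusion rule applied to the coefficient maps; the residual-plus-max stand-in above only mirrors that structure.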
615 citations
TL;DR: This survey paper presents a systematic review of the DL-based pixel-level image fusion literature, summarizes the main difficulties that exist in conventional image fusion research, and discusses the advantages that DL can offer in addressing each of these problems.
Abstract: By integrating the information contained in multiple images of the same scene into one composite image, pixel-level image fusion is recognized as having high significance in a variety of fields, including medical imaging, digital photography, remote sensing, and video surveillance. In recent years, deep learning (DL) has achieved great success in a number of computer vision and image processing problems. The application of DL techniques to pixel-level image fusion has also emerged as an active topic in the last three years. This survey paper presents a systematic review of the DL-based pixel-level image fusion literature. Specifically, we first summarize the main difficulties that exist in conventional image fusion research and discuss the advantages that DL can offer to address each of these problems. Then, the recent achievements in DL-based image fusion are reviewed in detail: more than a dozen recently proposed image fusion methods based on DL techniques, including convolutional neural networks (CNNs), convolutional sparse representation (CSR), and stacked autoencoders (SAEs), are introduced. Finally, by summarizing the existing DL-based image fusion methods into several generic frameworks and presenting a potential DL-based framework for developing objective evaluation metrics, we put forward some prospects for future study on this topic. The key issues and challenges that exist in each framework are discussed.
493 citations
TL;DR: Experimental results demonstrate that the proposed method outperforms nine representative medical image fusion methods, achieving state-of-the-art results in both visual quality and objective assessment.
Abstract: As an effective way to integrate the information contained in multiple medical images with different modalities, medical image fusion has emerged as a powerful technique in various clinical applications such as disease diagnosis and treatment planning. In this paper, a new multimodal medical image fusion method in the nonsubsampled shearlet transform (NSST) domain is proposed. In the proposed method, the NSST decomposition is first performed on the source images to obtain their multiscale and multidirection representations. The high-frequency bands are fused by a parameter-adaptive pulse-coupled neural network (PA-PCNN) model, in which all the PCNN parameters are adaptively estimated from the input band. The low-frequency bands are merged by a novel strategy that simultaneously addresses two crucial issues in medical image fusion, namely energy preservation and detail extraction. Finally, the fused image is reconstructed by performing the inverse NSST on the fused high-frequency and low-frequency bands. The effectiveness of the proposed method is verified on four categories of medical image fusion problems [computed tomography (CT) and magnetic resonance (MR), MR-T1 and MR-T2, MR and positron emission tomography, and MR and single-photon emission CT] with more than 80 pairs of source images in total. Experimental results demonstrate that the proposed method outperforms nine representative medical image fusion methods, achieving state-of-the-art results in both visual quality and objective assessment.
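As a rough illustration of the coefficient-selection idea, the sketch below runs a simplified PCNN over two high-frequency bands and keeps, at each position, the coefficient whose neuron fires more often. The fixed parameters are assumed values, not the adaptively estimated ones of the PA-PCNN model, and generic bands stand in for NSST subbands.

```python
import numpy as np
from scipy.ndimage import convolve

def pcnn_firing(stim, n_iter=110, beta=0.2, alpha_e=0.1, v_e=20.0):
    """Return per-pixel total firing counts for a coefficient band."""
    w = np.array([[0.5, 1.0, 0.5],
                  [1.0, 0.0, 1.0],
                  [0.5, 1.0, 0.5]])                # linking weights
    f = np.abs(stim)                               # feeding input
    y = np.zeros_like(f)                           # pulse output
    e = np.ones_like(f)                            # dynamic threshold
    fires = np.zeros_like(f)
    for _ in range(n_iter):
        l = convolve(y, w, mode="constant")        # linking from neighbors
        u = f * (1.0 + beta * l)                   # internal activity
        y = (u > e).astype(f.dtype)                # fire where activity exceeds threshold
        e = np.exp(-alpha_e) * e + v_e * y         # threshold decays, then jumps on firing
        fires += y
    return fires

def fuse_band(band_a, band_b):
    """Keep the coefficient whose PCNN neuron fires more often."""
    mask = pcnn_firing(band_a) >= pcnn_firing(band_b)
    return np.where(mask, band_a, band_b)
```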
381 citations
TL;DR: This paper proposes a method for fusing infrared and visible images of the same scene to generate a composite image that provides a more comprehensive description of the scene.
Abstract: The fusion of infrared and visible images of the same scene aims to generate a composite image which can provide a more comprehensive description of the scene. In this paper, we propose an infrared...
245 citations
Cited by
TL;DR: This review covers nearly every application and technology in the field of remote sensing, ranging from preprocessing to mapping, and presents a conclusion on current state-of-the-art methods, a critical discussion of open challenges, and directions for future research.
Abstract: Deep learning (DL) algorithms have seen a massive rise in popularity for remote-sensing image analysis over the past few years. In this study, the major DL concepts pertinent to remote sensing are introduced, and more than 200 publications in this field, most of which were published during the last two years, are reviewed and analyzed. Initially, a meta-analysis was conducted to analyze the status of remote sensing DL studies in terms of the study targets, DL model(s) used, image spatial resolution(s), type of study area, and level of classification accuracy achieved. Subsequently, a detailed review is conducted to describe and discuss how DL has been applied to remote sensing image analysis tasks, including image fusion, image registration, scene classification, object detection, land use and land cover (LULC) classification, segmentation, and object-based image analysis (OBIA). This review covers nearly every application and technology in the field of remote sensing, ranging from preprocessing to mapping. Finally, a conclusion regarding current state-of-the-art methods, a critical discussion of open challenges, and directions for future research are presented.
1,181 citations
TL;DR: This paper proposes a novel method to fuse two types of information using a generative adversarial network, termed FusionGAN, which establishes an adversarial game between a generator and a discriminator, where the generator aims to generate a fused image with major infrared intensities together with additional visible gradients.
Abstract: Infrared images can distinguish targets from their backgrounds on the basis of differences in thermal radiation, which works well at all times of day and night and under all weather conditions. By contrast, visible images can provide texture details with high spatial resolution and definition in a manner consistent with the human visual system. This paper proposes a novel method to fuse these two types of information using a generative adversarial network, termed FusionGAN. Our method establishes an adversarial game between a generator and a discriminator, where the generator aims to generate a fused image with major infrared intensities together with additional visible gradients, and the discriminator aims to force the fused image to have more of the details existing in visible images. This ensures that the final fused image simultaneously keeps the thermal radiation of the infrared image and the textures of the visible image. In addition, our FusionGAN is an end-to-end model, avoiding the manual design of complicated activity level measurements and fusion rules required by traditional methods. Experiments on public datasets demonstrate the superiority of our strategy over state-of-the-art methods, where our results look like sharpened infrared images with clearly highlighted targets and abundant details. Moreover, we also generalize our FusionGAN to fuse images with different resolutions, say a low-resolution infrared image and a high-resolution visible image. Extensive results demonstrate that our strategy can generate clear and clean fused images which do not suffer from noise caused by upsampling of infrared information.
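The generator objective described above, an adversarial term plus a content term that pulls the fused image toward infrared intensities and visible gradients, can be sketched as follows in PyTorch. The loss weights, the gradient operator, and the cross-entropy adversarial form are assumptions for illustration rather than FusionGAN's published formulation.

```python
import torch
import torch.nn.functional as F

def gradient(img):
    """Forward-difference gradient magnitude proxy for (N, C, H, W) tensors."""
    dx = img[..., :, 1:] - img[..., :, :-1]
    dy = img[..., 1:, :] - img[..., :-1, :]
    return F.pad(dx, (0, 1, 0, 0)).abs() + F.pad(dy, (0, 0, 0, 1)).abs()

def generator_loss(fused, ir, vis, d_fake, xi=0.5, lam=100.0):
    """Adversarial term + content term (IR intensities, visible gradients)."""
    adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    content = F.mse_loss(fused, ir) + xi * F.mse_loss(gradient(fused), gradient(vis))
    return adv + lam * content
```

The key design choice this mirrors is the asymmetric content term: pixel intensities are anchored to the infrared source while only gradients are anchored to the visible source, which is what makes the fused result resemble a detail-enriched infrared image.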
853 citations
TL;DR: This survey comprehensively reviews the existing methods and applications for the fusion of infrared and visible images and can serve as a reference for researchers in infrared and visible image fusion and related fields.
Abstract: Infrared images can distinguish targets from their backgrounds based on the radiation difference, which works well in all-weather and all-day/night conditions. By contrast, visible images can provide texture details with high spatial resolution and definition in a manner consistent with the human visual system. Therefore, it is desirable to fuse these two types of images, which can combine the advantages of the thermal radiation information in infrared images and the detailed texture information in visible images. In this work, we comprehensively survey the existing methods and applications for the fusion of infrared and visible images. First, infrared and visible image fusion methods are reviewed in detail. Meanwhile, image registration, as a prerequisite of image fusion, is briefly introduced. Second, we provide an overview of the main applications of infrared and visible image fusion. Third, the evaluation metrics of fusion performance are discussed and summarized. Fourth, we select eighteen representative methods and nine assessment metrics to conduct qualitative and quantitative experiments, which can provide an objective performance reference for different fusion methods and thus support related engineering work with credible and solid evidence. Finally, we conclude with the current status of infrared and visible image fusion and deliver insightful discussions and prospects for future work. This survey can serve as a reference for researchers in infrared and visible image fusion and related fields.
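Two of the simplest assessment metrics commonly tabulated in such surveys are image entropy (EN) and source-to-fused mutual information (MI). A minimal NumPy sketch, assuming images normalized to [0, 1] and a conventional 256-bin histogram:

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy of the gray-level distribution (higher = more information)."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(a, b, bins=256):
    """Mutual information between two images via their joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins,
                                 range=[[0.0, 1.0], [0.0, 1.0]])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0
    return np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz]))

# Fusion MI is conventionally reported as the sum over both sources:
# mi_total = mutual_information(ir, fused) + mutual_information(vis, fused)
```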
849 citations