Journal Article DOI: 10.1109/TCI.2021.3063872

SMFuse: Multi-Focus Image Fusion Via Self-Supervised Mask-Optimization

04 Mar 2021 - IEEE Transactions on Computational Imaging (IEEE) - Vol. 7, pp. 309-320
Abstract: In this paper, a novel self-supervised mask-optimization model, termed SMFuse, is proposed for multi-focus image fusion. In our model, given two source images, a fully end-to-end Mask-Generator is trained via self-supervised learning to directly generate the binary mask without requiring any patch operation or post-processing. On the one hand, based on the principle of repeated blur, we design a Guided-Block with a guided filter to obtain an initial binary mask from the source images, narrowing the solution domain and speeding up the convergence of binary mask generation; this stage is constrained by a map loss. On the other hand, as the focused regions in source images show richer texture details than the defocused ones, i.e., larger gradients, we also design a max-gradient loss between the fused image and the source images as a follow-up optimization, ensuring that the fused image is all-in-focus and forcing our model to learn a more accurate binary mask. Extensive experimental results on two publicly available datasets substantiate the effectiveness and superiority of our SMFuse compared with the current state-of-the-art. Our code is publicly available at https://github.com/jiayi-ma/SMFuse.
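As a point of reference, the max-gradient idea described above can be sketched in a few lines of PyTorch; the finite-difference gradient operator, padding, and L1 reduction here are illustrative assumptions rather than the authors' exact loss (the official code is at the repository linked above).

import torch
import torch.nn.functional as F

def gradient_magnitude(img):
    # Forward differences along width and height (illustrative choice of operator).
    dx = F.pad(img[:, :, :, 1:] - img[:, :, :, :-1], (0, 1, 0, 0))
    dy = F.pad(img[:, :, 1:, :] - img[:, :, :-1, :], (0, 0, 0, 1))
    return torch.sqrt(dx ** 2 + dy ** 2 + 1e-8)

def max_gradient_loss(fused, src_a, src_b):
    # The fused image should carry, at every pixel, the larger of the two source
    # gradients, i.e. the detail of whichever source image is in focus there.
    target = torch.maximum(gradient_magnitude(src_a), gradient_magnitude(src_b))
    return F.l1_loss(gradient_magnitude(fused), target)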

Citations

5 results found


Journal Article DOI: 10.1016/J.INFFUS.2021.06.008
Hao Zhang, Han Xu, Xin Tian, Junjun Jiang, +1 more (2 institutions)
01 Dec 2021 - Information Fusion
Abstract: Image fusion, which refers to extracting and then combining the most meaningful information from different source images, aims to generate a single image that is more informative and beneficial for subsequent applications. The development of deep learning has promoted tremendous progress in image fusion, and the powerful feature extraction and reconstruction capabilities of neural networks make the fused results promising. Recently, the latest deep learning technologies, e.g., generative adversarial networks and autoencoders, have driven explosive growth in image fusion research. However, a comprehensive review and analysis of the latest deep-learning methods in different fusion scenarios is lacking. To this end, in this survey we first introduce the concept of image fusion and classify the methods from the perspectives of the deep architectures adopted and the fusion scenarios. Then, we review the state of the art on the use of deep learning in various types of image fusion scenarios, including digital photography image fusion, multi-modal image fusion, and sharpening fusion. Subsequently, evaluations of some representative methods in specific fusion tasks are performed qualitatively and quantitatively. Moreover, we briefly introduce several typical applications of image fusion, including photography visualization, RGBT object tracking, medical diagnosis, and remote sensing monitoring. Finally, we provide the conclusion, highlight the challenges in image fusion, and look forward to potential future research directions.

Topics: Image fusion (68%), Feature extraction (55%), Autoencoder (52%)

10 Citations


Journal Article DOI: 10.1016/J.JVCIR.2021.103328
Xiaole Ma, Zhihai Wang, Shaohai Hu (1 institution)
Abstract: Although rich color information in natural scenes can be collected, the limited depth of field of cameras makes it hard to capture an all-in-focus image. Sparse representation (SR)-based methods have shown powerful potential in multi-focus image fusion. However, because of sparse coding and information compression, existing SR-based fusion methods struggle to capture the rich details and significant texture information in source images. Therefore, a fusion method based on multi-scale sparse representation for registered multi-focus images (MIF-MsSR) is proposed in this paper, in which an adaptive fusion rule for sparse coefficients is presented. First, the source images are processed by multi-scale decomposition to obtain sub-images at different scales. According to the different richness of image features in these sub-images, dictionaries with different sizes and redundancies are trained. By comprehensively considering the relationships among focused areas, out-of-focus areas, and boundary areas between the source images, an adaptive fusion rule based on l0-max and the Sum-Modified-Laplacian (SML) is proposed. Finally, an all-in-focus fused image is obtained by sparse reconstruction and inverse multi-scale decomposition. Extensive experiments on multi-focus images demonstrate that the proposed MIF-MsSR not only preserves the integrity of the information in the source images, but also achieves better fusion performance on subjective and objective indicators than other state-of-the-art methods.
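For reference, the Sum-Modified-Laplacian focus measure underlying the fusion rule can be sketched as below; the window size and step are illustrative defaults, and the l0-max part of the rule is not reproduced.

import numpy as np
from scipy.ndimage import convolve

def sum_modified_laplacian(img, window=3, step=1):
    # Modified Laplacian at each pixel:
    # |2*I(x,y) - I(x-s,y) - I(x+s,y)| + |2*I(x,y) - I(x,y-s) - I(x,y+s)|
    p = np.pad(img.astype(np.float64), step, mode="edge")
    c = p[step:-step, step:-step]
    ml = (np.abs(2 * c - p[step:-step, :-2 * step] - p[step:-step, 2 * step:]) +
          np.abs(2 * c - p[:-2 * step, step:-step] - p[2 * step:, step:-step]))
    # SML: sum the modified Laplacian over a local window (box filter).
    return convolve(ml, np.ones((window, window)), mode="nearest")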

Topics: Sparse approximation (64%), Image fusion (64%), Neural coding (51%)

1 Citation


Journal Article DOI: 10.1109/TIM.2021.3124058
Yu Liu, Lei Wang, Juan Cheng, Xun Chen (2 institutions)
Abstract: In deep learning (DL)-based multifocus image fusion, effective multiscale feature learning is a key issue to promote fusion performance. In this article, we propose a novel DL model named multiscale feature interactive network (MSFIN), which can segment the source images into focused and defocused regions accurately by sufficient interaction of multiscale features from layers of different depths in the network for multifocus image fusion. Specifically, based on the popular encoder–decoder framework, two functional modules, namely, multiscale feature fusion (MSFF) and coordinate attention upsample (CAU), are designed for interactive multiscale feature learning. Moreover, the weighted binary cross-entropy (WBCE) loss and the multilevel supervision (MLS) strategy are introduced to train the network more effectively. Qualitative and quantitative comparisons with 19 representative multifocus image fusion methods demonstrate that the proposed method can achieve state-of-the-art performance. The code of our method is available at https://github.com/yuliu316316/MSFIN-Fusion .
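A minimal sketch of a weighted binary cross-entropy (WBCE) of the kind referred to above; the class-balancing weights used here are an assumption for illustration, not necessarily the authors' exact formulation (see the linked repository for the official code).

import torch

def weighted_bce(pred_mask, gt_mask, eps=1e-6):
    # Weight each class inversely to its pixel frequency so that small focused
    # (or defocused) regions are not drowned out by the majority class.
    pos_ratio = gt_mask.mean().clamp(eps, 1 - eps)
    w_pos, w_neg = 1.0 / pos_ratio, 1.0 / (1.0 - pos_ratio)
    pred = pred_mask.clamp(eps, 1 - eps)
    loss = -(w_pos * gt_mask * torch.log(pred) + w_neg * (1 - gt_mask) * torch.log(1 - pred))
    return loss.mean()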

Topics: Image fusion (57%), Feature (computer vision) (57%), Feature extraction (56%)

Journal Article DOI: 10.1016/J.SIGPRO.2021.108282
Zhao Duan, Taiping Zhang, Xiaoliu Luo, Jin Tan (1 institution)
01 Dec 2021 - Signal Processing
Abstract: In current multi-focus image fusion approaches with convolutional neural networks (CNNs), the same set of convolutional kernels is applied to multi-focus images for feature extraction in all regions. However, the same kernels may not be optimal for all regions in multi-focus images, incurring artifacts in textureless and edge regions of the fused image. To address these problems, this paper proposes a dynamic convolutional kernel network (DCKN) for multi-focus image fusion, in which the convolutional kernels are dynamically generated from region context conditioned on the input images. The kernels in the proposed architecture are not only position-varying but also sample-varying, and can therefore adapt accurately to the spatially variant blur caused by depth and texture variations in multi-focus images. Moreover, our DCKN works not only with supervised learning but also with unsupervised learning. For supervised learning, the ground-truth fusion image is used to supervise the output fused image. For unsupervised learning, we introduce a bright-channel loss and a total-variation loss to jointly constrain the DCKN. The bright-channel metric can roughly determine whether source pixels are focused or not, and is used to guide the training of the unsupervised network. Extensive experiments on popular multi-focus images show that our DCKN, without any post-processing, is comparable to state-of-the-art approaches, and our unsupervised model obtains high fusion quality.
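A rough sketch of the unsupervised loss idea described above, combining a bright-channel focus cue with a total-variation smoothness term; the patch size, weighting, and the exact way the bright channel guides training are assumptions for illustration, not the authors' definition.

import torch
import torch.nn.functional as F

def bright_channel(img, patch=15):
    # Per-pixel maximum over channels and over a local patch (dual of the dark channel).
    return F.max_pool2d(img.max(dim=1, keepdim=True).values,
                        kernel_size=patch, stride=1, padding=patch // 2)

def total_variation(img):
    tv_h = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
    tv_w = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
    return tv_h + tv_w

def unsupervised_loss(fused, src_a, src_b, tv_weight=0.1):
    # Focused pixels tend to have a stronger bright-channel response, so the fused
    # image is pushed toward the per-pixel stronger of the two source responses.
    target = torch.maximum(bright_channel(src_a), bright_channel(src_b))
    return F.l1_loss(bright_channel(fused), target) + tv_weight * total_variation(fused)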

Topics: Unsupervised learning (61%), Image fusion (60%), Convolutional neural network (59%)

Journal Article DOI: 10.1016/J.IMAGE.2021.116533
Shuaiqi Liu, Jian Ma, Yang Yang, Tian Qiu, +3 more (5 institutions)
Abstract: Multi-focus image fusion is the process of generating a fused image by merging multiple images with different degrees of focus in the same scene. In multi-focus image fusion, the accuracy of the detected focus area is critical for improving the quality of the fused image. Combining the structural gradient, we propose a multi-focus color image fusion algorithm based on low-vision image reconstruction and focus feature extraction. First, the source images are input into a deep residual network (ResNet) to reconstruct low-vision images by super-resolution. Next, an end-to-end restoration model is used to improve image details and maintain image edges through a rolling guidance filter. The difference image is then obtained from the reconstructed image and the source image. Next, the fusion decision map is generated by a structural-gradient-based focus area detection method. Finally, the source images and the fusion decision map are combined by weighted fusion to generate the fused image. Experimental results show that our algorithm is accurate in detecting the edges of the focus area. Compared with other algorithms, the proposed algorithm improves the accuracy of distinguishing focused and defocused areas, and it retains the detailed texture features and edge structure of the source images well.
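The last two steps (decision map plus weighted fusion) can be sketched roughly as follows; a plain Sobel gradient magnitude stands in for the paper's structural gradient, and the averaging window is an illustrative assumption.

import numpy as np
from scipy import ndimage

def decision_map(src_a, src_b, window=7):
    a = src_a.astype(np.float64)
    b = src_b.astype(np.float64)
    ga = np.hypot(ndimage.sobel(a, axis=0), ndimage.sobel(a, axis=1))
    gb = np.hypot(ndimage.sobel(b, axis=0), ndimage.sobel(b, axis=1))
    # A pixel is taken from source A where its local gradient energy dominates.
    return (ndimage.uniform_filter(ga, window) >
            ndimage.uniform_filter(gb, window)).astype(np.float64)

def weighted_fusion(src_a, src_b, dmap):
    return dmap * src_a + (1.0 - dmap) * src_b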

Topics: Image fusion (71%), Color image (64%), Feature extraction (62%)
References

43 results found


Open access Book Chapter DOI: 10.1007/978-3-319-10602-1_48
Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, +4 more (4 institutions)
06 Sep 2014
Abstract: We present a new dataset with the goal of advancing the state of the art in object recognition by placing the question of object recognition in the context of the broader question of scene understanding. This is achieved by gathering images of complex everyday scenes containing common objects in their natural context. Objects are labeled using per-instance segmentations to aid in precise object localization. Our dataset contains photos of 91 object types that would be easily recognizable by a 4-year-old. With a total of 2.5 million labeled instances in 328k images, the creation of our dataset drew upon extensive crowd worker involvement via novel user interfaces for category detection, instance spotting, and instance segmentation. We present a detailed statistical analysis of the dataset in comparison to PASCAL, ImageNet, and SUN. Finally, we provide baseline performance analysis for bounding box and segmentation detection results using a Deformable Parts Model.

Topics: Object detection (54%)

18,843 Citations


Proceedings Article DOI: 10.1109/ICCV.1998.710815
Carlo Tomasi, Roberto Manduchi (2 institutions)
04 Jan 1998
Abstract: Bilateral filtering smooths images while preserving edges, by means of a nonlinear combination of nearby image values. The method is noniterative, local, and simple. It combines gray levels or colors based on both their geometric closeness and their photometric similarity, and prefers near values to distant values in both domain and range. In contrast with filters that operate on the three bands of a color image separately, a bilateral filter can enforce the perceptual metric underlying the CIE-Lab color space, and smooth colors and preserve edges in a way that is tuned to human perception. Also, in contrast with standard filtering, bilateral filtering produces no phantom colors along edges in color images, and reduces phantom colors where they appear in the original image.
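For illustration, a minimal (unoptimized) grayscale implementation of the bilateral filter described above; the spatial and range sigmas are example values.

import numpy as np

def bilateral_filter(img, radius=3, sigma_s=2.0, sigma_r=0.1):
    img = img.astype(np.float64)
    h, w = img.shape
    out = np.zeros_like(img)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma_s ** 2))   # geometric closeness
    padded = np.pad(img, radius, mode="edge")
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            rng = np.exp(-((patch - img[i, j]) ** 2) / (2 * sigma_r ** 2))  # photometric similarity
            weights = spatial * rng
            out[i, j] = np.sum(weights * patch) / np.sum(weights)
    return out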

Topics: Color histogram (65%), Color balance (64%), Color quantization (64%)

8,053 Citations


Journal Article DOI: 10.1109/TPAMI.2012.213
Kaiming He, Jian Sun, Xiaoou Tang (2 institutions)
Abstract: In this paper, we propose a novel explicit image filter called guided filter. Derived from a local linear model, the guided filter computes the filtering output by considering the content of a guidance image, which can be the input image itself or another different image. The guided filter can be used as an edge-preserving smoothing operator like the popular bilateral filter [1], but it has better behaviors near edges. The guided filter is also a more generic concept beyond smoothing: It can transfer the structures of the guidance image to the filtering output, enabling new filtering applications like dehazing and guided feathering. Moreover, the guided filter naturally has a fast and nonapproximate linear time algorithm, regardless of the kernel size and the intensity range. Currently, it is one of the fastest edge-preserving filters. Experiments show that the guided filter is both effective and efficient in a great variety of computer vision and computer graphics applications, including edge-aware smoothing, detail enhancement, HDR compression, image matting/feathering, dehazing, joint upsampling, etc.
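The local linear model behind the guided filter can be sketched for the grayscale case as follows; the radius r and regularization eps are illustrative values.

import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(guide, src, r=8, eps=1e-3):
    guide = guide.astype(np.float64)
    src = src.astype(np.float64)
    box = lambda x: uniform_filter(x, size=2 * r + 1, mode="nearest")
    mean_i, mean_p = box(guide), box(src)
    cov_ip = box(guide * src) - mean_i * mean_p
    var_i = box(guide * guide) - mean_i * mean_i
    a = cov_ip / (var_i + eps)      # local linear coefficient: q = a * I + b
    b = mean_p - a * mean_i
    # Average the coefficients of all windows covering each pixel, then apply them.
    return box(a) * guide + box(b)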

Topics: Edge-preserving smoothing (74%), Composite image filter (66%), Bilateral filter (63%)

3,721 Citations


Open access Proceedings Article DOI: 10.1109/ICCV.2017.304
Xudong Mao, Qing Li, Haoran Xie, Raymond Y. K. Lau, +2 more (3 institutions)
01 Oct 2017
Abstract: Unsupervised learning with generative adversarial networks (GANs) has proven hugely successful. Regular GANs hypothesize the discriminator as a classifier with the sigmoid cross-entropy loss function. However, we found that this loss function may lead to the vanishing gradients problem during the learning process. To overcome such a problem, we propose in this paper the Least Squares Generative Adversarial Networks (LSGANs), which adopt the least squares loss function for the discriminator. We show that minimizing the objective function of LSGAN yields minimizing the Pearson χ2 divergence. There are two benefits of LSGANs over regular GANs. First, LSGANs are able to generate higher quality images than regular GANs. Second, LSGANs perform more stably during the learning process. We evaluate LSGANs on the LSUN and CIFAR-10 datasets, and the experimental results show that the images generated by LSGANs are of better quality than the ones generated by regular GANs. We also conduct two comparison experiments between LSGANs and regular GANs to illustrate the stability of LSGANs.
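The least-squares objectives can be sketched as follows, using the common 0/1 target coding (the paper also analyzes other target codings); this is a generic illustration, not the authors' code.

import torch

def lsgan_d_loss(d_real, d_fake):
    # Discriminator: push outputs on real samples toward 1 and on fake samples toward 0.
    return 0.5 * ((d_real - 1.0) ** 2).mean() + 0.5 * (d_fake ** 2).mean()

def lsgan_g_loss(d_fake):
    # Generator: push discriminator outputs on generated samples toward 1.
    return 0.5 * ((d_fake - 1.0) ** 2).mean()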

Topics: Unsupervised learning (52%)

2,429 Citations


Journal Article DOI: 10.1109/26.477498
Abstract: A number of quality measures are evaluated for gray scale image compression. They are all bivariate, exploiting the differences between corresponding pixels in the original and degraded images. It is shown that although some numerical measures correlate well with the observers' response for a given compression technique, they are not reliable for an evaluation across different techniques. A graphical measure called Hosaka plots, however, can be used to appropriately specify not only the amount, but also the type of degradation in reconstructed images.
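As an example of the bivariate, pixel-difference measures this study evaluates, MSE and PSNR can be written as follows; this is a generic illustration, not the paper's code.

import numpy as np

def mse(original, degraded):
    return np.mean((original.astype(np.float64) - degraded.astype(np.float64)) ** 2)

def psnr(original, degraded, peak=255.0):
    return 10.0 * np.log10(peak ** 2 / mse(original, degraded))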

Topics: Image quality (57%), Image compression (56%), Data compression (54%)

1,445 Citations


Performance Metrics
No. of citations received by the paper in previous years

Year    Citations
2022    1
2021    4