Proceedings ArticleDOI

Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection

TLDR
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls it into a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
Abstract
This study addresses the problem of fusing infrared and visible images, which appear very different, for object detection. Aiming to generate an image of high visual quality, previous approaches discover commonalities underlying the two modalities and fuse within that common space, either by iterative optimization or with deep networks. These approaches neglect that modality differences, which carry complementary information, are extremely important for both fusion and the subsequent detection task. This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, and then unrolls it into a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network. The fusion network, with one generator and dual discriminators, seeks commonalities while learning from differences, preserving structural information of targets from the infrared modality and textural details from the visible modality. Furthermore, we build a synchronized imaging system with calibrated infrared and optical sensors and collect what is currently the most comprehensive benchmark, covering a wide range of scenarios. Extensive experiments on several public datasets and our benchmark demonstrate that our method produces not only visually appealing fused images but also higher detection mAP than state-of-the-art approaches. The source code and benchmark are available at https://github.com/dlut-dimt/TarDAL.
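The one-generator, dual-discriminator layout described in the abstract can be pictured with a minimal PyTorch sketch. This is an illustrative assumption only: the layer widths, the PatchGAN-style critics, and the names FusionGenerator/Discriminator are placeholders, not the authors' released architecture (see the linked repository for that).

# Minimal sketch of a one-generator / dual-discriminator fusion setup.
# Assumption: simple convolutional blocks; losses and the detection branch omitted.
import torch
import torch.nn as nn

class FusionGenerator(nn.Module):
    """Fuses a 1-channel infrared and a 1-channel visible image into one image."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch, ch, 3, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch, 1, 3, padding=1), nn.Tanh(),
        )

    def forward(self, ir: torch.Tensor, vis: torch.Tensor) -> torch.Tensor:
        # Concatenate the two modalities along the channel axis and fuse.
        return self.net(torch.cat([ir, vis], dim=1))

class Discriminator(nn.Module):
    """PatchGAN-style critic; one instance judges target structure against the
    infrared input, the other judges texture detail against the visible input."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch * 2, 1, 4, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

if __name__ == "__main__":
    ir = torch.rand(1, 1, 256, 256)   # infrared frame
    vis = torch.rand(1, 1, 256, 256)  # visible (grayscale) frame
    gen = FusionGenerator()
    d_target, d_detail = Discriminator(), Discriminator()
    fused = gen(ir, vis)
    print(fused.shape, d_target(fused).shape, d_detail(fused).shape)

In the paper's formulation, the fused image would then be passed to a standard detector, with the detection loss shaping the fusion through the bilevel objective.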



Citations
Journal ArticleDOI

Infrared and Visible Image Fusion via Decoupling Network

TL;DR: Wang et al. propose a decoupling-network-based IVIF method (DNFusion), which uses the decoupled maps to impose additional constraints on the network and force it to effectively retain the saliency information of the source images.
Journal ArticleDOI

SuperFusion: A Versatile Image Registration and Fusion Network with Semantic Awareness

TL;DR: Tang et al. propose a novel image registration and fusion method, named SuperFusion, which integrates image registration, image fusion, and the semantic requirements of high-level vision tasks into a single framework.
Journal ArticleDOI

Target Oriented Perceptual Adversarial Fusion Network for Underwater Image Enhancement

TL;DR: This work proposes a target-oriented perceptual adversarial fusion network, dubbed TOPAL, which greatly improves the quality of underwater images and achieves superior performance over existing methods.
Proceedings ArticleDOI

Unsupervised Misaligned Infrared and Visible Image Fusion via Cross-Modality Image Generation and Registration

TL;DR: Proposes a robust cross-modality generation-registration paradigm for unsupervised misaligned infrared and visible image fusion (IVIF), introducing a Multi-level Refinement Registration Network (MRRN) to predict the displacement vector field between distorted and pseudo-infrared images and to reconstruct the registered infrared image under a mono-modality setting.
Book ChapterDOI

ReCoNet: Recurrent Correction Network for Fast and Efficient Multi-modality Image Fusion

TL;DR: ReCoNet introduces a deformation module to explicitly compensate for geometric distortions and an attention mechanism to mitigate ghosting-like artifacts, effectively and efficiently alleviating both structural distortions and textural artifacts caused by slight misalignment.
References
Journal ArticleDOI

Image quality assessment: from error visibility to structural similarity

TL;DR: In this article, a structural similarity (SSIM) index is proposed for image quality assessment based on the degradation of structural information, and is validated against subjective ratings and objective methods on a database of images compressed with JPEG and JPEG2000.
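For reference, the structural similarity index introduced in this work is commonly cited in its local, per-patch form, where \mu, \sigma, and \sigma_{xy} are local means, standard deviations, and covariance, and C_1, C_2 are small stabilizing constants:

\mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + C_1)\,(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)\,(\sigma_x^2 + \sigma_y^2 + C_2)}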
Proceedings ArticleDOI

Dual Attention Network for Scene Segmentation

TL;DR: Achieves new state-of-the-art segmentation performance on three challenging scene segmentation datasets, i.e., Cityscapes, PASCAL Context, and COCO Stuff, without using coarse data.
Proceedings ArticleDOI

SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks

TL;DR: This work shows that the core reason Siamese trackers still have an accuracy gap is the lack of strict translation invariance, and proposes a new model architecture performing depth-wise and layer-wise aggregations, which not only improves accuracy but also reduces model size.
Journal ArticleDOI

Image Fusion With Guided Filtering

TL;DR: Experimental results demonstrate that the proposed method can obtain state-of-the-art performance for fusion of multispectral, multifocus, multimodal, and multiexposure images.
Journal ArticleDOI

Information measure for performance of image fusion

TL;DR: The results show that the measure reflects how much information the fused image obtains from the input images and is meaningful and explicit.
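To our understanding, this fusion metric is usually written as the sum of the mutual information between the fused image F and each source image A and B, estimated from their joint gray-level histograms; the notation below is ours and is given only as a hedged sketch of the usual form:

\mathrm{MI}_F^{AB} = I(F; A) + I(F; B), \qquad I(F; A) = \sum_{f,\,a} p_{FA}(f, a)\,\log\frac{p_{FA}(f, a)}{p_F(f)\,p_A(a)}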