Journal ArticleDOI

A general framework for image fusion based on multi-scale transform and sparse representation

01 Jul 2015-Information Fusion (Elsevier)-Vol. 24, Iss: 24, pp 147-164
TL;DR: A general image fusion framework by combining MST and SR to simultaneously overcome the inherent defects of both the MST- and SR-based fusion methods is presented and experimental results demonstrate that the proposed fusion framework can obtain state-of-the-art performance.
About: This article was published in Information Fusion on 2015-07-01. It has received 952 citations to date. The article focuses on the topics: Image fusion & Contourlet.
Citations
Journal ArticleDOI
TL;DR: It is concluded that although various image fusion methods have been proposed, several future directions remain open in different image fusion applications, and research in the image fusion field is still expected to grow significantly in the coming years.

871 citations


Cites background or methods from "A general framework for image fusio..."

  • ...propose a general image fusion framework based on multi-scale transform and sparse representation[85]....

    [...]

  • ...Combination of different transforms hybrid wavelet-contourlet [84], multi-scale transform-SR [85], morphological component analysis-SR [86], contourlet-SR [87], IHS-retina-inspired models [88], IHS-wavelet [89] coefficient and window based activity level measurement [85–87], choose-max and weighted-average based coefficient combining method [85–87], component substitution [89], integration of component substitution and weighted average [88]...

    [...]

Journal ArticleDOI
TL;DR: This paper proposes a novel method to fuse two types of information using a generative adversarial network, termed as FusionGAN, which establishes an adversarial game between a generator and a discriminator, where the generator aims to generate a fused image with major infrared intensities together with additional visible gradients.

853 citations


Cites background or methods from "A general framework for image fusio..."

  • ...They can be simply divided into seven categories including multi-scale transform- [5, 6, 7], sparse representation- [8, 9], neural network- [10, 11], subspace- [12, 13], and saliency-based [14, 15] methods, hybrid models [16, 17], and other methods [18, 19]....

    [...]

  • ...The image fusion problem has been developed with different schemes including multiscale transform- [5, 6, 7], sparse representation- [8, 9], neural network- [10, 11], subspace- [12, 13], and saliency-based [14, 15] methods, hybrid models [16, 17], and other methods [18, 19]....

    [...]

  • ...Meanwhile, sparse representation-based fusion methods divide source images into several overlapping patches using a sliding window strategy, thereby potentially reducing visual artifacts and improving robustness to misregistration [16]....

    [...]

  • ...models combine their advantages to improve the image fusion performance [16]....

    [...]
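The sliding window strategy mentioned in the excerpts above can be illustrated with a minimal sketch. It assumes a grayscale image stored as a list of rows; the function names `extract_patches` and `reassemble`, the patch size, and the step are hypothetical choices for illustration and are not taken from the cited papers. Overlapping patches are averaged when written back, which is one common way to recombine them.

```python
def extract_patches(image, patch_size, step):
    """Slide a patch_size x patch_size window over the image.

    Returns a list of (top, left, patch) tuples, where patch is the window
    flattened in row-major order. A step smaller than patch_size makes
    neighbouring patches overlap.
    """
    rows, cols = len(image), len(image[0])
    patches = []
    for top in range(0, rows - patch_size + 1, step):
        for left in range(0, cols - patch_size + 1, step):
            patch = [image[top + r][left + c]
                     for r in range(patch_size)
                     for c in range(patch_size)]
            patches.append((top, left, patch))
    return patches


def reassemble(patches, rows, cols, patch_size):
    """Write patches back and average every pixel over the patches covering it."""
    total = [[0.0] * cols for _ in range(rows)]
    count = [[0] * cols for _ in range(rows)]
    for top, left, patch in patches:
        for r in range(patch_size):
            for c in range(patch_size):
                total[top + r][left + c] += patch[r * patch_size + c]
                count[top + r][left + c] += 1
    # Pixels never covered by a window (possible when step > 1) are left at 0.0.
    return [[total[r][c] / count[r][c] if count[r][c] else 0.0
             for c in range(cols)]
            for r in range(rows)]
```

In a patch-based fusion pipeline, each flattened patch would be processed (for example, sparse coded) between these two calls; with unmodified patches, `reassemble` returns the original image.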

Journal ArticleDOI
Jiayi Ma, Yong Ma, Chang Li
TL;DR: This survey comprehensively reviews the existing methods and applications for the fusion of infrared and visible images, and can serve as a reference for researchers in infrared and visible image fusion and related fields.

849 citations

Journal ArticleDOI
TL;DR: A new multi-focus image fusion method is proposed that learns a direct mapping between the source images and the focus map, using a deep convolutional neural network trained on high-quality image patches and their blurred versions to encode the mapping.

826 citations

Journal ArticleDOI
TL;DR: A novel fusion algorithm, named Gradient Transfer Fusion (GTF), based on gradient transfer and total variation (TV) minimization is proposed, which can keep both the thermal radiation and the appearance information in the source images.

729 citations

References
Journal ArticleDOI
TL;DR: A structural similarity index is proposed for image quality assessment based on the degradation of structural information, and it is compared with both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000.
Abstract: Objective methods for assessing perceptual image quality traditionally attempted to quantify the visibility of errors (differences) between a distorted image and a reference image using a variety of known properties of the human visual system. Under the assumption that human visual perception is highly adapted for extracting structural information from a scene, we introduce an alternative complementary framework for quality assessment based on the degradation of structural information. As a specific example of this concept, we develop a structural similarity index and demonstrate its promise through a set of intuitive examples, as well as comparison to both subjective ratings and state-of-the-art objective methods on a database of images compressed with JPEG and JPEG2000. A MATLAB implementation of the proposed algorithm is available online at http://www.cns.nyu.edu/~lcv/ssim/.
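As a rough illustration of the structural similarity index described in this abstract, the sketch below evaluates the standard SSIM expression over a single window of pixels. It is a simplification: published implementations typically compute the index over local windows and average the results, which is not done here. The function name `ssim_window` is hypothetical, and the constants K1 = 0.01, K2 = 0.03 and a dynamic range of 255 are the commonly used defaults, stated here as assumptions.

```python
def ssim_window(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM between two equally sized windows given as flat lists of pixels.

    Computes
        (2*mu_x*mu_y + C1) * (2*cov_xy + C2)
        ------------------------------------------------
        (mu_x^2 + mu_y^2 + C1) * (var_x + var_y + C2)
    with C1 = (k1 * L)^2 and C2 = (k2 * L)^2, where L is the dynamic range.
    """
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # Unbiased (n - 1) estimates of the variances and the covariance.
    var_x = sum((a - mu_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mu_y) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / (n - 1)
    c1 = (k1 * dynamic_range) ** 2  # stabilises the luminance term
    c2 = (k2 * dynamic_range) ** 2  # stabilises the contrast/structure term
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator
```

Comparing a window with itself gives a value of 1 (up to rounding), and windows whose structure differs give a lower value.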

40,609 citations

Journal ArticleDOI
TL;DR: The authors introduce an algorithm, called matching pursuit, that decomposes any signal into a linear expansion of waveforms that are selected from a redundant dictionary of functions, chosen in order to best match the signal structures.
Abstract: The authors introduce an algorithm, called matching pursuit, that decomposes any signal into a linear expansion of waveforms that are selected from a redundant dictionary of functions. These waveforms are chosen in order to best match the signal structures. Matching pursuits are general procedures to compute adaptive signal representations. With a dictionary of Gabor functions a matching pursuit defines an adaptive time-frequency transform. They derive a signal energy distribution in the time-frequency plane, which does not include interference terms, unlike Wigner and Cohen class distributions. A matching pursuit isolates the signal structures that are coherent with respect to a given dictionary. An application to pattern extraction from noisy signals is described. They compare a matching pursuit decomposition with a signal expansion over an optimized wavepacket orthonormal basis, selected with the algorithm of Coifman and Wickerhauser (see IEEE Trans. Informat. Theory, vol. 38, Mar. 1992).
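The greedy decomposition described in this abstract can be sketched in a few lines. The example assumes a finite dictionary of unit-norm atoms stored as lists of floats and a fixed number of iterations; the function name `matching_pursuit` and both of those choices are illustrative assumptions rather than details from the paper.

```python
def matching_pursuit(signal, dictionary, n_iter):
    """Greedy matching pursuit over a dictionary of unit-norm atoms.

    At each step the atom with the largest absolute inner product with the
    current residual is selected, its contribution is added to the
    coefficient vector, and it is subtracted from the residual.
    Returns (coefficients, residual).
    """
    residual = [float(v) for v in signal]
    coeffs = [0.0] * len(dictionary)
    for _ in range(n_iter):
        inner = [sum(r * a for r, a in zip(residual, atom))
                 for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda i: abs(inner[i]))
        if inner[best] == 0.0:
            break  # residual is orthogonal to every atom; nothing left to match
        coeffs[best] += inner[best]
        residual = [r - inner[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return coeffs, residual
```

With the three standard basis vectors of R^3 plus the extra atom (0.6, 0, 0.8), the signal (3, 0, 4) is matched by the extra atom alone in a single iteration, which shows how a redundant dictionary can give a sparser expansion than a basis.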

9,380 citations


"A general framework for image fusio..." refers methods in this paper

  • ...(iii) Calculate the sparse coefficient vectors $\{\alpha_A, \alpha_B\}$ of $\{\hat{v}_A, \hat{v}_B\}$ using the orthogonal matching pursuit (OMP) algorithm [21] by...

    [...]
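The excerpt above is truncated before the objective that the paper solves, so the following is only a generic sketch of orthogonal matching pursuit, not the authors' exact formulation. It assumes linearly independent atoms stored as lists of floats; the function names `omp`, `solve_linear` and `dot`, the atom budget `max_atoms`, and the residual tolerance `tol` are hypothetical. Unlike plain matching pursuit, OMP re-fits all selected coefficients by least squares after each selection.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes a is square and non-singular (true for the Gram matrix of
    linearly independent atoms); a singular matrix raises ZeroDivisionError.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def omp(signal, dictionary, max_atoms, tol=1e-10):
    """Return a coefficient vector with at most max_atoms non-zero entries."""
    residual = [float(v) for v in signal]
    support, weights = [], []
    while len(support) < max_atoms and dot(residual, residual) > tol:
        # Score only atoms that are not already selected.
        scores = [0.0 if i in support else abs(dot(residual, atom))
                  for i, atom in enumerate(dictionary)]
        best = max(range(len(dictionary)), key=scores.__getitem__)
        if scores[best] == 0.0:
            break
        support.append(best)
        # Least-squares fit on the selected atoms via the normal equations.
        gram = [[dot(dictionary[i], dictionary[j]) for j in support]
                for i in support]
        rhs = [dot(dictionary[i], signal) for i in support]
        weights = solve_linear(gram, rhs)
        approx = [sum(w * dictionary[i][t] for w, i in zip(weights, support))
                  for t in range(len(signal))]
        residual = [s - p for s, p in zip(signal, approx)]
    coeffs = [0.0] * len(dictionary)
    for w, i in zip(weights, support):
        coeffs[i] = w
    return coeffs
```

The normal equations are used here only to keep the sketch free of external dependencies; a QR or Cholesky update is the usual choice when numerical robustness matters.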

Journal ArticleDOI
TL;DR: A novel algorithm for adapting dictionaries in order to achieve sparse signal representations, the K-SVD algorithm, an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data.
Abstract: In recent years there has been a growing interest in the study of sparse representation of signals. Using an overcomplete dictionary that contains prototype signal-atoms, signals are described by sparse linear combinations of these atoms. Applications that use sparse representation are many and include compression, regularization in inverse problems, feature extraction, and more. Recent activity in this field has concentrated mainly on the study of pursuit algorithms that decompose signals with respect to a given dictionary. Designing dictionaries to better fit the above model can be done by either selecting one from a prespecified set of linear transforms or adapting the dictionary to a set of training signals. Both of these techniques have been considered, but this topic is largely still open. In this paper we propose a novel algorithm for adapting dictionaries in order to achieve sparse signal representations. Given a set of training signals, we seek the dictionary that leads to the best representation for each member in this set, under strict sparsity constraints. We present a new method, the K-SVD algorithm, generalizing the K-means clustering process. K-SVD is an iterative method that alternates between sparse coding of the examples based on the current dictionary and a process of updating the dictionary atoms to better fit the data. The update of the dictionary columns is combined with an update of the sparse representations, thereby accelerating convergence. The K-SVD algorithm is flexible and can work with any pursuit method (e.g., basis pursuit, FOCUSS, or matching pursuit). We analyze this algorithm and demonstrate its results both on synthetic tests and in applications on real image data.

8,905 citations

Journal ArticleDOI
TL;DR: A technique for image encoding in which local operators of many scales but identical shape serve as the basis functions, which tends to enhance salient image features and is well suited for many image analysis tasks as well as for image compression.
Abstract: We describe a technique for image encoding in which local operators of many scales but identical shape serve as the basis functions. The representation differs from established techniques in that the code elements are localized in spatial frequency as well as in space. Pixel-to-pixel correlations are first removed by subtracting a lowpass filtered copy of the image from the image itself. The result is a net data compression since the difference, or error, image has low variance and entropy, and the low-pass filtered image may be represented at reduced sample density. Further data compression is achieved by quantizing the difference image. These steps are then repeated to compress the low-pass image. Iteration of the process at appropriately expanded scales generates a pyramid data structure. The encoding process is equivalent to sampling the image with Laplacian operators of many scales. Thus, the code tends to enhance salient image features. A further advantage of the present code is that it is well suited for many image analysis tasks as well as for image compression. Fast algorithms are described for coding and decoding.
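The pyramid construction described in this abstract (low-pass filter, subsample, subtract the expanded low-pass copy, repeat) can be sketched on a one-dimensional signal. Several details are assumptions made for brevity: the signal is 1-D rather than an image, the 5-tap binomial kernel and the replicated borders are illustrative choices, quantization is omitted, and all function names are hypothetical.

```python
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]  # assumed 5-tap low-pass filter


def _blur(signal):
    """Low-pass filter a list of floats, replicating the border samples."""
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = min(max(i + k - 2, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def reduce_level(signal):
    """Low-pass filter, then keep every second sample."""
    return _blur(signal)[::2]


def expand_level(signal, length):
    """Upsample to `length` by inserting zeros, then interpolate by filtering."""
    up = [0.0] * length
    for i, v in enumerate(signal):
        up[2 * i] = 2.0 * v  # factor 2 compensates for the inserted zeros
    return _blur(up)


def build_pyramid(signal, levels):
    """Return [band_0, ..., band_{levels-1}, low_pass_residual]."""
    pyramid = []
    current = [float(v) for v in signal]
    for _ in range(levels):
        smaller = reduce_level(current)
        predicted = expand_level(smaller, len(current))
        pyramid.append([c - p for c, p in zip(current, predicted)])
        current = smaller
    pyramid.append(current)
    return pyramid


def reconstruct(pyramid):
    """Invert build_pyramid by expanding and adding the bands back, coarse to fine."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        predicted = expand_level(current, len(band))
        current = [b + p for b, p in zip(band, predicted)]
    return current
```

Because each band stores exactly the difference between a level and its prediction from the next coarser level, `reconstruct(build_pyramid(s, k))` recovers `s` up to floating-point rounding, whatever kernel is used.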

6,975 citations


"A general framework for image fusio..." refers methods in this paper

  • ...Classical MST-based fusion methods include pyramid-based ones like Laplacian pyramid (LP) [2], ratio of low-pass pyramid (RP) [3] and gradient pyramid (GP) [4], wavelet-based ones like discrete wavelet transform (DWT) [5], stationary wavelet transform (SWT) [6] and dual-tree complex wavelet transform (DTCWT) [7], and multi-scale geometric analysis (MGA)-based ones like curvelet transform (CVT) [8] and nonsubsampled contourlet transform (NSCT) [9]....

    [...]

Journal ArticleDOI
13 Jun 1996-Nature
TL;DR: It is shown that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex.
Abstract: The receptive fields of simple cells in mammalian primary visual cortex can be characterized as being spatially localized, oriented and bandpass (selective to structure at different spatial scales), comparable to the basis functions of wavelet transforms. One approach to understanding such response properties of visual neurons has been to consider their relationship to the statistical structure of natural images in terms of efficient coding. Along these lines, a number of studies have attempted to train unsupervised learning algorithms on natural images in the hope of developing receptive fields with similar properties, but none has succeeded in producing a full set that spans the image space and contains all three of the above properties. Here we investigate the proposal that a coding strategy that maximizes sparseness is sufficient to account for these properties. We show that a learning algorithm that attempts to find sparse linear codes for natural scenes will develop a complete family of localized, oriented, bandpass receptive fields, similar to those found in the primary visual cortex. The resulting sparse image code provides a more efficient representation for later stages of processing because it possesses a higher degree of statistical independence among its outputs.

5,947 citations


"A general framework for image fusio..." refers background in this paper

  • ...Sparse representation addresses the signals’ natural sparsity, which is in accord with the physiological characteristics of human visual system [12]....

    [...]