Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches
Summary (5 min read)
I. INTRODUCTION
- The focus here is on those covering the visible, near-infrared, and shortwave infrared spectral bands (in the range 0.3 to 2.5 μm [5]).
- They are organized into planes forming a data cube.
- Each spectral vector corresponds to the radiance acquired at a given location for all spectral bands.
A. Linear and Nonlinear Mixing Models
- Hyperspectral unmixing (HU) refers to any process that separates the pixel spectra from a hyperspectral image into a collection of constituent spectra, or spectral signatures, called endmembers and a set of fractional abundances, one set per pixel.
- The endmembers are generally assumed to represent the pure materials present in the image and the set of abundances, or simply abundances, at each pixel to represent the percentage of each endmember that is present in the pixel.
- Suppose a hyperspectral image contains spectra measured from bricks laid on the ground, the mortar between the bricks, and two types of plants that are growing through cracks in the brick.
- One may suppose then that there are four endmembers.
- Linear mixing holds when the mixing scale is macroscopic [30] and the incident light interacts with just one material, as is the case in checkerboard type scenes [31] , [32] .
- Conversely, nonlinear mixing is usually due to physical interactions between the light scattered by multiple materials in the scene.
- Mixing at the classical level occurs when light is scattered from one or more objects, is reflected off additional objects, and eventually is measured by the hyperspectral imager.
- Generally, however, the first order terms are sufficient and this leads to the bilinear model.
- The reason is that, despite its simplicity, it is an acceptable approximation of the light scattering mechanisms in many real scenarios.
- Others will be discussed throughout the rest of this paper.
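The linear and first-order bilinear models described above can be sketched numerically. Everything below (the signature matrix, abundances, and interaction coefficient) is illustrative random data, not real spectra:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: four endmembers (e.g., brick, mortar, and two
# plants, as in the example above) observed in 200 spectral bands.
n_bands, n_end = 200, 4
M = rng.random((n_bands, n_end))        # endmember signatures as columns

# Abundances drawn from a flat Dirichlet are nonnegative and sum to one.
a = rng.dirichlet(np.ones(n_end))

# Linear mixing model: observed spectrum = M @ a + noise.
y = M @ a + 0.001 * rng.standard_normal(n_bands)

# Bilinear extension: keep only the first-order interaction terms,
# i.e., elementwise products of pairs of signatures, weighted by a
# small (illustrative) coefficient beta.
beta = 0.02
y_bilinear = y + beta * sum(M[:, i] * M[:, j]
                            for i in range(n_end)
                            for j in range(i + 1, n_end))
```

The nested sum makes explicit why the bilinear model is only a first-order correction: it adds one interaction term per pair of materials on top of the linear mixture.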
B. Brief Overview of Nonlinear Approaches
- Radiative transfer theory (RTT) [48] is a well established mathematical model for the transfer of energy as photons interact with the materials in the scene.
- They mainly differ from each other by the additivity constraints imposed on the mixing coefficients [63] .
- If such information is not available, these signatures have to be estimated from the data by using an endmember extraction algorithm.
- Mainly due to the difficulty of the issue, very few attempts have been made to address the problem of fully unsupervised nonlinear unmixing.
- Even more recently, the same authors have shown in [74] that exact geodesic distances can be derived on any data manifold induced by a nonlinear mixing model, such as the generalized bilinear model introduced in [62].
C. Hyperspectral Unmixing Processing Chain
- Fig. 4 shows the processing steps usually involved in the hyperspectral unmixing chain: atmospheric correction, dimensionality reduction, and unmixing, which may be tackled via the classical endmember determination plus inversion, or via sparse regression or sparse coding approaches.
- The atmosphere attenuates and scatters the light and therefore affects the radiance at the sensor.
- There are, however, many hyperspectral unmixing approaches in which the endmember determination and inversion steps are implemented simultaneously.
- Each of these sections introduces the underlying mathematical problem and summarizes state-of-the-art algorithms to address it.
- Illustration of the concept of simplex of minimum volume containing the data for three data sets.
II. LINEAR MIXTURE MODEL
- Therefore, the fractional abundances are subject to the nonnegativity and sum-to-one constraints (2); i.e., the fractional abundance vector (with the transpose notation indicating a column vector) lies in the standard simplex (or unit simplex).
- Fig. 5 illustrates a 2-simplex for a hypothetical mixing matrix containing three endmembers.
- The points in green denote spectral vectors, whereas the points in red are vertices of the simplex and correspond to the endmembers.
- The left hand side data set contains pure pixels, i.e., for each endmember there is at least one pixel containing only the corresponding material; the data set in the middle does not contain pure pixels but contains spectral vectors on each facet of the simplex.
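The abundance constraints behind this simplex picture can be checked in a few lines; the helper name below is ours, not the paper's:

```python
import numpy as np

def in_unit_simplex(a, tol=1e-9):
    """Check the abundance constraints of Eq. (2): nonnegativity
    and full additivity (the components sum to one)."""
    a = np.asarray(a, dtype=float)
    return bool(a.min() >= -tol and abs(a.sum() - 1.0) <= tol)

# Three endmembers: the vertices of the 2-simplex are the pure pixels.
assert in_unit_simplex([1.0, 0.0, 0.0])       # pure pixel (a vertex)
assert in_unit_simplex([0.2, 0.5, 0.3])       # mixed pixel (interior)
assert not in_unit_simplex([0.7, 0.7, -0.4])  # violates nonnegativity
```

A pure pixel is simply an abundance vector sitting on a vertex of this simplex, which is why pure-pixel data sets make endmember extraction so much easier.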
A. Characterization of the Spectral Unmixing Inverse Problem
- Given a data set of observed spectral vectors (one per pixel), the linear HU problem is, with reference to the linear model (3), the estimation of the mixing matrix and of the fractional abundance vector of every pixel.
- To characterize the linear HU inverse problem, the authors use the signal-to-noise-ratio (SNR), defined as the ratio between the expected powers of the signal and of the noise, computed from the respective correlation matrices.
- Besides the SNR, the authors introduce the signal-to-noise-ratio spectral distribution (SNR-SD), defined in (4) as a per-eigendirection ratio of signal to noise power, with the eigenvalue-eigenvector pairs of the signal correlation matrix ordered by decreasing eigenvalue.
- For SusgsP5SNR40, the singular values of the mixing matrix decay faster due to the high correlation of the USGS spectral signatures.
- Nevertheless, the "big picture" is similar to that of the SudP5SNR40 data set.
III. SIGNAL SUBSPACE IDENTIFICATION
- The number of endmembers present in a given scene is very often much smaller than the number of bands.
- Unsupervised subspace identification has been approached in many ways.
- NAPC is mathematically equivalent to MNF [90] and can be interpreted as a sequence of two principal component transforms: the first applies to the noise and the second applies to the transformed data set.
- This framework consists of several modules, where the dimension reduction is achieved by identifying a subset of exemplar pixels that convey the variability in a scene.
- HySime ( hyperspectral signal identification by minimum error) [83] adopts a minimum mean squared error based approach to infer the signal subspace.
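A minimal stand-in for subspace identification (an eigen-energy threshold on the sample correlation matrix, not HySime itself, which also estimates the noise statistics) can be sketched as follows; the function name and the energy threshold are illustrative:

```python
import numpy as np

def subspace_dim_by_energy(Y, energy=0.999):
    """Simplified subspace identification: return the smallest k whose
    leading eigenvalues of the sample correlation matrix capture the
    given fraction of the total energy."""
    R = (Y @ Y.T) / Y.shape[1]                  # sample correlation matrix
    eigvals = np.linalg.eigvalsh(R)[::-1]       # descending order
    frac = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(frac, energy) + 1)

# Sanity check: 1000 pixels mixing 5 endmembers in 50 bands.
rng = np.random.default_rng(1)
M = rng.random((50, 5))
A = rng.dirichlet(np.ones(5), size=1000).T      # abundances, one per pixel
Y = M @ A + 1e-6 * rng.standard_normal((50, 1000))
```

On this nearly noiseless synthetic cube, the energy criterion recovers the true subspace dimension of 5, far below the 50 bands, which is exactly the gap that makes dimensionality reduction worthwhile.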
A. Projection on the Signal Subspace
- Replacing the observed vector by the observation model (3) yields the corresponding decomposition of the projected data. As referred to before, projecting onto a signal subspace can yield large computational, storage, and SNR gains.
- The mean power of the projected noise term is then obtained by taking the expected value of its squared norm.
- The noise and the signal subspace were estimated with HySime [83] .
- The identified subspace has dimension 18.
- This effectiveness can also be perceived from the scatter plots of the noisy (blue dots) and denoised (green dots) eigen-images 17 and 18 shown in the bottom right hand side figure.
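The denoising effect of projecting onto the signal subspace can be reproduced on synthetic data. Here the subspace is estimated with a plain eigendecomposition of the sample correlation matrix rather than with HySime, and all dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n_bands, n_pix, k = 100, 500, 8

# Synthetic data confined to a k-dimensional subspace, plus white noise.
U_true = np.linalg.qr(rng.standard_normal((n_bands, k)))[0]
X = U_true @ rng.standard_normal((k, n_pix))
Y = X + 0.1 * rng.standard_normal((n_bands, n_pix))

# Estimate the signal subspace from the top-k eigenvectors of the
# sample correlation matrix (a simplified stand-in for HySime).
R = (Y @ Y.T) / n_pix
_, V = np.linalg.eigh(R)
U = V[:, -k:]

# Projection keeps the signal but discards the noise component that is
# orthogonal to the subspace -- most of it, since k << n_bands.
Y_denoised = U @ (U.T @ Y)
err_before = np.linalg.norm(Y - X)
err_after = np.linalg.norm(Y_denoised - X)
```

Because only k of the n_bands noise directions survive the projection, the reconstruction error drops by roughly a factor of sqrt(k / n_bands), which is the SNR gain referred to above.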
B. Affine Set Projection
- From now on, the authors assume that the observed data set has been projected onto the signal subspace; for simplicity of notation, they still represent the projected vectors as in (3), leading to the lower-dimensional model (5).
- Hence, instead of one matrix of endmember spectra for the entire scene, there is a matrix of endmember spectra for each pixel.
- Other methods can be applied to ensure that the sum-to-one constraint is a better model, such as the following: a) Orthogonal projection: use PCA to identify the affine set that best represents the observed data in the least squares sense and then compute the orthogonal projection of the observed vectors onto this set (see [119] for details).
- These effects are illustrated in Fig. 12 for the Rterrain data set.
- The figure on the left hand side plots the angles between the unprojected and the orthogonally projected vectors, as a function of the norm of the unprojected vectors.
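The orthogonal projection in a) can be sketched directly; the function name is ours, and the affine set is identified with PCA about the data mean, as described above:

```python
import numpy as np

def project_to_affine_set(Y, p):
    """Orthogonal projection of the spectral vectors (columns of Y)
    onto the (p-1)-dimensional affine set that best represents them
    in the least squares sense, identified via PCA about the mean."""
    mean = Y.mean(axis=1, keepdims=True)
    Yc = Y - mean
    U, _, _ = np.linalg.svd(Yc, full_matrices=False)
    E = U[:, : p - 1]                     # orthonormal basis of the set
    return mean + E @ (E.T @ Yc)
```

When the abundances exactly satisfy the sum-to-one constraint, the data already lie in a (p-1)-dimensional affine set and the projection leaves them unchanged; otherwise it moves each vector by the amount plotted in Fig. 12.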
IV. GEOMETRICAL BASED APPROACHES TO LINEAR SPECTRAL UNMIXING
- The geometrical-based approaches are categorized into two main categories: Pure Pixel (PP) based and Minimum Volume (MV) based.
- There are a few other approaches that will also be discussed.
A. Geometrical Based Approaches: Pure Pixel Based Algorithms
- The pure pixel based algorithms still belong to the MV class but assume the presence in the data of at least one pure pixel per endmember, meaning that there is at least one spectral vector on each vertex of the data simplex.
- In any case, these algorithms find the set of the most pure pixels in the data.
- A new endmember is identified based on the angle it makes with the existing cone.
- Endmembers are selected from the LAMS using the notions of affine independence and similarity measures such as spectral angle, correlation, mutual information, or Chebyshev distance.
- Algorithms AVMAX and SVMAX were derived in [126] under a continuous optimization framework inspired by Winter's maximum volume criterion [73], which underlies N-FINDR.
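In the same spirit as these pure-pixel algorithms (a plain successive-projection scheme, not a faithful reproduction of VCA or N-FINDR), the core idea of repeatedly picking an extreme pixel and deflating the data can be sketched as:

```python
import numpy as np

def successive_projection(Y, p):
    """Pure-pixel extraction sketch: repeatedly pick the pixel with
    the largest residual norm and project all pixels onto the
    orthogonal complement of the selected direction."""
    R = Y.astype(float).copy()
    idx = []
    for _ in range(p):
        j = int(np.argmax((R * R).sum(axis=0)))   # most extreme pixel
        idx.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)                # deflate that direction
    return idx
```

Because the norm of a convex combination is maximized at a vertex of the simplex, each pick lands on a pure pixel whenever one exists per endmember, which is precisely the assumption this class of algorithms relies on.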
B. Geometrical Based Approaches: Minimum Volume Based Algorithms
- The MV approaches seek a mixing matrix that minimizes the volume of the simplex defined by its columns, subject to the constraint that this simplex contains the observed spectral vectors.
- The optimization (11) minimizes a two-term objective function, in which the first term measures the approximation error and the second term measures the square of the volume of the simplex defined by the columns of the mixing matrix.
- Fuzzy clustering algorithms allow every data point to be assigned to every cluster to some degree.
- Assuming that the data contain multiple simplexes, the objective function in (13) can be used to attempt to find endmember spectra and abundances for each simplex; the membership variables in (13) represent the degree to which each data point belongs to each simplex.
- In the top right, SISAL and MVC-NMF produce good results, but VCA and N-FINDR show a degradation in performance because there are no pure pixels.
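The two-term objective of (11) can be evaluated as below; the simplex volume is computed from the Gram determinant of the edge vectors, and the weight tau is illustrative:

```python
import math
import numpy as np

def mv_objective(Y, M, A, tau=0.01):
    """Sketch of the two-term minimum volume objective: squared data
    misfit plus tau times the squared volume of the simplex whose
    vertices are the columns of M (endmembers in a (p-1)-dim space)."""
    fit = np.linalg.norm(Y - M @ A) ** 2
    p = M.shape[1]
    D = M[:, 1:] - M[:, :1]               # edge vectors from vertex 1
    vol = math.sqrt(abs(np.linalg.det(D.T @ D))) / math.factorial(p - 1)
    return fit + tau * vol ** 2

# Unit triangle in 2-D: vertices (0,0), (1,0), (0,1); its area is 1/2.
M_demo = np.array([[0.0, 1.0, 0.0],
                   [0.0, 0.0, 1.0]])
A_demo = np.array([[0.2, 0.5],
                   [0.3, 0.3],
                   [0.5, 0.2]])
Y_demo = M_demo @ A_demo                  # exact mixtures: zero misfit
```

Shrinking the simplex lowers the volume term while worsening the fit once data points fall outside, which is the tension the MV algorithms balance.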
V. STATISTICAL METHODS
- When the spectral mixtures are highly mixed, the geometrical based methods yield poor results because there are not enough spectral vectors in the simplex facets.
- Under the statistical framework, spectral unmixing is formulated as a statistical inference problem.
- The hyperparameters involved in the definition of the parameter priors are then assigned non-informative priors and are jointly estimated from the full posterior of the parameters and hyperparameters.
- This is the case with DECA [169] , [170] ; the abundance fractions are modeled as mixtures of Dirichlet densities, thus, automatically enforcing the constraints on abundance fractions imposed by the acquisition process, namely nonnegativity and constant sum.
- As the authors are not really sure about the true endmembers, it is reasonable to conclude that the statistical approach produces estimates similar to or better than those of the geometrical based algorithms.
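A DECA-style abundance model can be simulated directly; the mixture weights and Dirichlet parameters below are illustrative, not fitted values:

```python
import numpy as np

rng = np.random.default_rng(4)

# Abundances drawn from a mixture of two Dirichlet densities. Every
# draw automatically satisfies nonnegativity and the sum-to-one
# constraint imposed by the acquisition process.
weights = np.array([0.6, 0.4])
alphas = np.array([[8.0, 1.0, 1.0],     # mode near endmember 1
                   [1.0, 1.0, 8.0]])    # mode near endmember 3

comp = rng.choice(2, size=1000, p=weights)
A = np.stack([rng.dirichlet(alphas[c]) for c in comp])
```

Because each mixture component concentrates mass in a different region of the simplex, this prior can model highly mixed scenes where no pixel lies near a vertex.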
VI. SPARSE REGRESSION BASED UNMIXING
- The spectral unmixing problem has recently been approached in a semi-supervised fashion, by assuming that the observed image signatures can be expressed in the form of linear combinations of a number of pure spectral signatures known in advance [173] - [175] (e.g., spectra collected on the ground by a field spectro-radiometer).
- Greedy algorithms such as orthogonal matching pursuit (OMP) [181], and convex approximations replacing the ℓ0 norm with the ℓ1 norm, termed basis pursuit (BP) in the noiseless case and basis pursuit denoising (BPDN) [179] in the noisy case, are alternative approaches to compute the sparsest solution.
- What is, perhaps, totally unexpected is that the sparse vector of fractional abundances can be reconstructed by solving (20) or (21), provided that the columns of the library matrix are incoherent in a given sense [186].
- The limitation imposed by the high correlation of the spectral signatures is mitigated by the high level of sparsity most often observed in hyperspectral mixtures.
- Furthermore, because the libraries are rarely acquired under the same conditions as the data sets under consideration, a delicate calibration procedure has to be carried out to adapt either the library to the data set or vice versa [173].
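A minimal sparse-regression solver in the spirit of (21) can be sketched with plain ISTA under a nonnegativity constraint (this is a stand-in, not the SUnSAL-type algorithms used in the literature):

```python
import numpy as np

def sparse_unmix_ista(D, y, lam=1e-4, n_iter=3000):
    """Sparse unmixing sketch: solve
        min_x  0.5 * ||y - D @ x||^2 + lam * ||x||_1   s.t.  x >= 0
    by ISTA; the proximal step is a one-sided soft threshold."""
    L = np.linalg.norm(D, 2) ** 2       # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)
        x = np.maximum(x - (grad + lam) / L, 0.0)
    return x
```

With a well-conditioned (incoherent) library and a truly sparse abundance vector, this simple iteration recovers both the support and the fractional abundances; highly correlated library columns are exactly the regime where it struggles, as noted above.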
VII. SPATIAL-SPECTRAL CONTEXTUAL INFORMATION
- Most of the unmixing strategies presented in the previous paragraphs are based on an objective criterion generally defined in the hyperspectral space.
- Similarly, the statistical-and sparsity-based algorithms of Sections V and VI exploit similar geometric constraints to penalize a standard data-fitting term (expressed as a likelihood function or quadratic error term).
- Such valuable information can be of great benefit for analyzing hyperspectral data.
- In [199] , abundance dependencies are modeled using Gaussian Markov random fields, which makes this approach particularly well adapted to unmix images with smooth abundance transition throughout the observed scene.
- The SPP is intended as a preprocessing module that can be used in combination with an existing spectral-based endmember extraction algorithm.
VIII. SUMMARY
- More than one decade after Keshava and Mustard's tutorial paper on spectral unmixing published in the IEEE Signal Processing Magazine [1] , effective spectral unmixing still remains an elusive exploitation goal and a very active research topic in the remote sensing community.
- The compendium of techniques presented in this work reflects the increasing sophistication of a field that is rapidly maturing at the intersection of many different disciplines, including signal and image processing, physical modeling, linear algebra and computing developments.
- A recent trend in hyperspectral imaging in general (and spectral unmixing in particular) has been the computationally efficient implementation of techniques using high performance computing (HPC) architectures [217] , [222] , [223] .
- Researchers have considered some candidate distributions, but by no means all of them.
- Finally, software tools and measurements for large scale quantitative analysis are needed to perform meaningful statistical analyses of algorithm performance.
Frequently Asked Questions (20)
Q2. Why are independent Gamma distributions elected as priors for the spectra?
Because of the sparse nature of the chemical spectral components, independent Gamma distributions are elected as priors for the spectra.
Q3. What constraints are used to solve a spectral unmixing problem?
When formulated as an optimization problem (e.g., implemented by the geometrical-based algorithms detailed in Section IV), spectral unmixing usually relies on algebraic constraints that are inherent to the observation space: positivity, additivity, and minimum volume.
Q4. What is the reason for the nonlinear mixing?
Nonlinear mixing is usually due to physical interactions between the light scattered by multiple materials in the scene.
Q5. What is the effect of chance constraints on the fractional abundances?
RMVES accounts for the noise effects in the observations by employing chance constraints, which act as soft constraints on the fractional abundances.
Q6. What is the way to put in evidence the impact of the angles between the library vectors?
In order to put in evidence the impact of the angles between the library vectors, and therefore the mutual coherence of the library [187], on the unmixing results, the authors organize the library into two subsets: the minimum angle between any two spectral signatures is higher than 7 degrees in the first set and lower than 4 degrees in the second.
Q7. What are some of the kernels that are designed to be flexible?
Some of these kernels are designed to be sufficiently flexible to allow several nonlinearity degrees (using, e.g., radial basis functions or polynomials expansions) while others are physics-inspired kernels [55].
Q8. How can the authors use sparse processing with libraries?
It may become necessary to include distributions or tree structured representations into sparse processing with libraries.
Q9. Why have they been used in linear unmixing applications?
They have probably been the most often used in linear hyperspectral unmixing applications, perhaps because of their light computational burden and clear conceptual meaning. (Footnote 5: http://www.agc.army.mil/hypercube)
Q10. What is the recent approach to the spectral unmixing problem?
The spectral unmixing problem has recently been approached in a semi-supervised fashion, by assuming that the observed image signatures can be expressed in the form of linear combinations of a number of pure spectral signatures known in advance [173]–[175] (e.g., spectra collected on the ground by a field spectro-radiometer).
Q11. How does the concept of thematic classification of hyperspectral images evolve?
As a prototypal task, thematic classification of hyperspectral images has recently motivated the development of a new class of algorithms that exploit both the spatial and spectral features contained in the image.
Q12. Why are there many methods that have not been addressed in this manuscript?
Because of the high level of activity and limited space, there are many methods that have not been addressed directly in this manuscript.
Q13. What is the earliest work dealing with unmixing of multi-band images?
One of the earliest works dealing with linear unmixing of multi-band images (cast as a soft classification problem) explicitly attempts to highlight spatial correlations between neighboring pixels.
Q14. What is the way to find the boundary points of the data convex cone?
Convex cone analysis (CCA) [148] finds the boundary points of the data convex cone (it does not apply an affine projection), which is very close to the MV concept.
Q15. What is the difference between ICA and hyperspectral data?
ICA is based on the assumption of mutually independent sources (abundance fractions), which is not the case of hyperspectral data, since the sum of abundance fractions is constant, implying statistical dependence among them.
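The dependence induced by the constant-sum constraint is easy to verify numerically: for symmetric Dirichlet abundances, any two fractions are negatively correlated (the theoretical correlation here, with four endmembers and unit parameters, is -1/3), so the ICA independence assumption cannot hold:

```python
import numpy as np

rng = np.random.default_rng(6)

# Abundance vectors from a symmetric Dirichlet: each row is
# nonnegative and sums to one, as the acquisition process imposes.
A = rng.dirichlet(np.ones(4), size=20000)

# The constant-sum constraint forces negative correlation between
# any pair of fractions: if one grows, the others must shrink.
corr = np.corrcoef(A[:, 0], A[:, 1])[0, 1]
```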
Q16. How do you get the estimates of the endmembers and fractional abundances?
The estimates of the endmembers and of the fractional abundances are obtained by a modification of the multiplicative update rules introduced in [147].
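For reference, the baseline multiplicative update rules of [147] (the Lee-Seung NMF updates, without the unmixing-specific modifications mentioned above) can be written as:

```python
import numpy as np

def nmf_multiplicative(Y, p, n_iter=500, eps=1e-9):
    """Plain Lee-Seung multiplicative updates for Y ~ M @ A with
    nonnegative factors; eps guards against division by zero. The
    updates keep both factors nonnegative by construction and never
    increase the squared reconstruction error."""
    rng = np.random.default_rng(5)
    n, m = Y.shape
    M = rng.random((n, p)) + eps          # endmember-like factor
    A = rng.random((p, m)) + eps          # abundance-like factor
    for _ in range(n_iter):
        A *= (M.T @ Y) / (M.T @ M @ A + eps)
        M *= (Y @ A.T) / (M @ A @ A.T + eps)
    return M, A
```

The unmixing variants modify exactly these two lines, e.g., to enforce the sum-to-one constraint on A or to add a volume penalty on M.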
Q17. How can the authors penalize the volume of the recovered simplex?
According to the optimization perspective suggested above, penalizing the volume of the recovered simplex can be accomplished by choosing an appropriate negative log-prior.
Q18. Why do the authors find that the CBPDN and CSR are often better than CLS and FCLS?
For this reason, and also due to the presence of noise and model mismatches, the authors have observed that CBPDN and CSR often yield better unmixing results than CLS and FCLS.
Q19. How is the pixel correlation model adapted to unmix images?
In [199], abundance dependencies are modeled using Gaussian Markov random fields, which makes this approach particularly well adapted to unmix images with smooth abundance transition throughout the observed scene.
Q20. What is the alternating volume maximization (AVMAX) algorithm?
The alternating volume maximization (AVMAX) [126], inspired by N-FINDR, maximizes, in a cyclic fashion, the volume of the simplex defined by the endmembers with respect to only one endmember at one time.