
Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches

TL;DR: This paper presents an overview of unmixing methods from the time of Keshava and Mustard's unmixing tutorial to the present, including signal-subspace, geometrical, statistical, sparsity-based, and spatial-contextual unmixing algorithms.
Abstract: Imaging spectrometers measure electromagnetic energy scattered in their instantaneous field of view in hundreds or thousands of spectral channels with higher spectral resolution than multispectral cameras. Imaging spectrometers are therefore often referred to as hyperspectral cameras (HSCs). Higher spectral resolution enables material identification via spectroscopic analysis, which facilitates countless applications that require identifying materials in scenarios unsuitable for classical spectroscopic analysis. Due to low spatial resolution of HSCs, microscopic material mixing, and multiple scattering, spectra measured by HSCs are mixtures of spectra of materials in a scene. Thus, accurate estimation requires unmixing. Pixels are assumed to be mixtures of a few materials, called endmembers. Unmixing involves estimating all or some of: the number of endmembers, their spectral signatures, and their abundances at each pixel. Unmixing is a challenging, ill-posed inverse problem because of model inaccuracies, observation noise, environmental conditions, endmember variability, and data set size. Researchers have devised and investigated many models searching for robust, stable, tractable, and accurate unmixing algorithms. This paper presents an overview of unmixing methods from the time of Keshava and Mustard's unmixing tutorial to the present. Mixing models are first discussed. Signal-subspace, geometrical, statistical, sparsity-based, and spatial-contextual unmixing algorithms are described. Mathematical problems and potential solutions are described. Algorithm characteristics are illustrated experimentally.

Summary (5 min read)

I. INTRODUCTION

  • The focus here is on hyperspectral cameras covering the visible, near-infrared, and shortwave infrared spectral bands (in the range 0.3 μm to 2.5 μm [5]).
  • They are organized into planes forming a data cube.
  • Each spectral vector corresponds to the radiance acquired at a given location for all spectral bands.

A. Linear and Nonlinear Mixing Models

  • Hyperspectral unmixing (HU) refers to any process that separates the pixel spectra from a hyperspectral image into a collection of constituent spectra, or spectral signatures, called endmembers and a set of fractional abundances, one set per pixel.
  • The endmembers are generally assumed to represent the pure materials present in the image and the set of abundances, or simply abundances, at each pixel to represent the percentage of each endmember that is present in the pixel.
  • Suppose a hyperspectral image contains spectra measured from bricks laid on the ground, the mortar between the bricks, and two types of plants that are growing through cracks in the brick.
  • One may suppose then that there are four endmembers.
  • Linear mixing holds when the mixing scale is macroscopic [30] and the incident light interacts with just one material, as is the case in checkerboard type scenes [31], [32].


  • Conversely, nonlinear mixing is usually due to physical interactions between the light scattered by multiple materials in the scene.
  • Mixing at the classical level occurs when light is scattered from one or more objects, is reflected off additional objects, and eventually is measured by a hyperspectral imager.
  • Generally, however, the first order terms are sufficient and this leads to the bilinear model.
  • The reason is that, despite its simplicity, it is an acceptable approximation of the light scattering mechanisms in many real scenarios.
  • Others will be discussed throughout the rest of this paper.

B. Brief Overview of Nonlinear Approaches

  • Radiative transfer theory (RTT) [48] is a well established mathematical model for the transfer of energy as photons interact with the materials in the scene.
  • The bilinear models mainly differ from each other by the additivity constraints imposed on the mixing coefficients [63].
  • If such information is not available, these signatures have to be estimated from the data by using an endmember extraction algorithm.
  • Mainly due to the difficulty of the issue, very few attempts have been made to address the problem of fully unsupervised nonlinear unmixing.
  • Even more recently, the same authors have shown in [74] that exact geodesic distances can be derived on any data manifold induced by a nonlinear mixing model, such as the generalized bilinear model introduced in [62].

C. Hyperspectral Unmixing Processing Chain

  • Fig. 4 shows the processing steps usually involved in the hyperspectral unmixing chain: atmospheric correction, dimensionality reduction, and unmixing, which may be tackled via the classical endmember determination plus inversion, or via sparse regression or sparse coding approaches.
  • The atmosphere attenuates and scatters the light and therefore affects the radiance at the sensor.
  • There are, however, many hyperspectral unmixing approaches in which the endmember determination and inversion steps are implemented simultaneously.
  • Each of these sections introduces the underlying mathematical problem and summarizes state-of-the-art algorithms that address it.
  • Illustration of the concept of simplex of minimum volume containing the data for three data sets.

II. LINEAR MIXTURE MODEL

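  • Therefore, the fractional abundances are subject to the following constraints: $\alpha_i \geq 0$ for $i = 1, \dots, p$ and $\sum_{i=1}^{p} \alpha_i = 1$ (2), i.e., the fractional abundance vector $\boldsymbol{\alpha} = [\alpha_1, \dots, \alpha_p]^T$ (the notation $(\cdot)^T$ indicates vector transposed) is in the standard $(p-1)$-simplex (or unit $(p-1)$-simplex).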
  • Fig. 5 illustrates a 2-simplex for a hypothetical mixing matrix containing three endmembers.
  • The points in green denote spectral vectors, whereas the points in red are vertices of the simplex and correspond to the endmembers.
  • The left hand side data set contains pure pixels, i.e., for each endmember there is at least one pixel containing only the corresponding material; the data set in the middle does not contain pure pixels but contains at least $p-1$ spectral vectors on each facet.
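As a rough illustration of what the simplex constraints mean in practice, the following Python sketch estimates the abundances of a single pixel when the endmember matrix is known. It is a minimal sketch under stated assumptions: the names `project_to_simplex` and `unmix_pixel`, the toy endmember matrix, and the choice of projected gradient descent are illustrative and are not taken from any algorithm cited in the paper.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)            # shift that makes the result sum to one
    return np.maximum(v - theta, 0.0)

def unmix_pixel(M, y, n_iter=500):
    """Minimize ||y - M a||^2 over the simplex by projected gradient descent."""
    p = M.shape[1]
    a = np.full(p, 1.0 / p)                   # start at the simplex center
    step = 1.0 / np.linalg.norm(M, 2) ** 2    # 1 / (largest singular value)^2
    for _ in range(n_iter):
        grad = M.T @ (M @ a - y)
        a = project_to_simplex(a - step * grad)
    return a

# Hypothetical mixing matrix: 4 bands, 3 endmembers (made-up values).
M = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
a_true = np.array([0.6, 0.3, 0.1])
y = M @ a_true                                # noiseless mixed pixel
a_hat = unmix_pixel(M, y)
```

Every iterate stays nonnegative and sums to one, so the estimate always lies in the simplex described above, whatever the noise level.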

A. Characterization of the Spectral Unmixing Inverse Problem

  • Given a data set of spectral vectors, each with one entry per spectral band, the linear HU problem is, with reference to the linear model (3), the estimation of the mixing matrix and of the fractional abundance vector of every pixel.
  • To characterize the linear HU inverse problem, the authors use the signal-to-noise ratio (SNR), defined from the signal and noise correlation matrices as the ratio of expected signal power to expected noise power.
  • Besides SNR, the authors introduce the signal-to-noise-ratio spectral distribution (SNR-SD), defined in (4) in terms of the eigenvalue-eigenvector couples of the signal correlation matrix, taken in decreasing order.
  • For SusgsP5SNR40, the singular values of the mixing matrix decay faster due to the high correlation of the USGS spectral signatures.
  • Nevertheless, the "big picture" is similar to that of the SudP5SNR40 data set.
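The SNR used to characterize a data set can be sketched numerically as follows. This is a minimal sketch assuming the common definition SNR = E[‖x‖²] / E[‖w‖²], computed from sample correlation matrices and expressed in decibels; the function name `snr_db`, the random test data, and the 40 dB target are assumptions made for illustration.

```python
import numpy as np

def snr_db(signal, noise):
    """SNR in dB for (B, n) arrays holding one spectral vector per column."""
    n = signal.shape[1]
    R_x = signal @ signal.T / n    # sample signal correlation matrix
    R_w = noise @ noise.T / n      # sample noise correlation matrix
    return 10.0 * np.log10(np.trace(R_x) / np.trace(R_w))

rng = np.random.default_rng(0)
X = rng.random((50, 1000))         # hypothetical noiseless spectra
W = rng.standard_normal(X.shape)   # white noise, rescaled below
# Rescale the noise so that the total power ratio corresponds to 40 dB.
W *= np.sqrt((X ** 2).sum() / (W ** 2).sum() / 10 ** (40 / 10))
value = snr_db(X, W)
```

The trace of a correlation matrix equals the mean squared norm of the vectors, which is why the ratio of traces gives the ratio of signal power to noise power.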

III. SIGNAL SUBSPACE IDENTIFICATION

  • The number of endmembers present in a given scene is, very often, much smaller than the number of bands.
  • Unsupervised subspace identification has been approached in many ways.
  • NAPC is mathematically equivalent to MNF [90] and can be interpreted as a sequence of two principal component transforms: the first applies to the noise and the second applies to the transformed data set.
  • This framework consists of several modules, where the dimension reduction is achieved by identifying a subset of exemplar pixels that convey the variability in a scene.
  • HySime (hyperspectral signal identification by minimum error) [83] adopts a minimum mean squared error based approach to infer the signal subspace.
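To make the idea of a low-dimensional signal subspace concrete, the sketch below estimates a basis with a plain truncated SVD and projects the data onto it. This is deliberately the simplest possible stand-in: it is not HySime, NAPC, or MNF, all of which also model the noise. The function names, the synthetic data, and the assumption that the subspace dimension `p` is known are illustrative.

```python
import numpy as np

def estimate_subspace(Y, p):
    """Return a (B, p) orthonormal basis (top-p left singular vectors) and all singular values."""
    U, s, _ = np.linalg.svd(Y, full_matrices=False)
    return U[:, :p], s

def project(Y, E):
    """Coordinates of each column of Y in the subspace spanned by the columns of E."""
    return E.T @ Y

rng = np.random.default_rng(0)
M = rng.random((20, 3))                         # hypothetical endmembers, 20 bands
A = rng.dirichlet(np.ones(3), size=200).T       # abundances on the simplex
Y = M @ A                                       # noiseless linear mixtures
E, s = estimate_subspace(Y, p=3)
Z = project(Y, E)                               # 3 coordinates per pixel instead of 20
```

With noiseless linear mixtures of three endmembers the fourth singular value is numerically zero, so the three-dimensional coordinates `Z` carry the same information as the original 20-band vectors.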

A. Projection on the Signal Subspace

  • Replacing the observation by the observation model (3) yields the projected model. As noted before, projecting onto a signal subspace can yield large computational, storage, and SNR gains.
  • The mean power of the projected noise term is then ( denotes mean value).
  • The noise and the signal subspace were estimated with HySime [83] .
  • The identified subspace has dimension 18.
  • This effectiveness can also be perceived from the scatter plots of the noisy (blue dots) and denoised (green dots) eigen-images 17 and 18 shown in the bottom right hand side figure.

B. Affine Set Projection

  • From now on, the authors assume that the observed data set has been projected onto the signal subspace and, for simplicity of notation, they still represent the projected vectors as in (3); the resulting model is labeled (5).
  • Hence, instead of one matrix of endmember spectra for the entire scene, there is a matrix of endmember spectra for each pixel.
  • Other methods can be applied to the data to ensure that the sum-to-one constraint is a better model, such as the following: a) Orthogonal projection: use PCA to identify the affine set that best represents the observed data in the least squares sense and then compute the orthogonal projection of the observed vectors onto this set (see [119] for details).
  • These effects are illustrated in Fig. 12 for the Rterrain data set.
  • The figure on the left hand side plots the angles between the unprojected and the orthogonally projected vectors, as a function of the norm of the unprojected vectors.

IV. GEOMETRICAL BASED APPROACHES TO LINEAR SPECTRAL UNMIXING

  • The geometrical-based approaches are categorized into two main categories: Pure Pixel (PP) based and Minimum Volume (MV) based.
  • There are a few other approaches that will also be discussed.

A. Geometrical Based Approaches: Pure Pixel Based Algorithms

  • The pure pixel based algorithms still belong to the MV class but assume the presence in the data of at least one pure pixel per endmember, meaning that there is at least one spectral vector on each vertex of the data simplex.
  • In any case, these algorithms find the set of most pure pixels in the data.
  • A new endmember is identified based on the angle it makes with the existing cone.
  • Endmembers are selected from the LAMS using the notions of affine independence and similarity measures such as spectral angle, correlation, mutual information, or Chebyshev distance.
  • Algorithms AVMAX and SVMAX were derived in [126] under a continuous optimization framework inspired by Winter's maximum volume criterion [73], which underlies N-FINDR.
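The pure pixel idea can be illustrated with a short sketch. The code below uses a successive projection scheme (pick the pixel with the largest norm, remove its direction from all pixels, repeat); it is offered as a simple stand-in for the pure pixel family and is not an implementation of N-FINDR, VCA, AVMAX, or SVMAX. The function name and the synthetic data, including the three pure pixels placed at indices 0 to 2, are assumptions.

```python
import numpy as np

def successive_projection(Y, p):
    """Return the indices of p columns of Y selected as endmember candidates."""
    R = np.array(Y, dtype=float)              # residual copy of the data
    picked = []
    for _ in range(p):
        j = int(np.argmax((R ** 2).sum(axis=0)))   # column with the largest norm
        picked.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)            # remove the component along u
    return picked

rng = np.random.default_rng(0)
M = rng.random((10, 3)) + 0.1                 # hypothetical endmembers, 10 bands
A = np.hstack([np.eye(3),                     # three pure pixels first
               rng.dirichlet(np.ones(3), 50).T])   # then 50 mixed pixels
Y = M @ A
picked = successive_projection(Y, 3)
```

Because the norm is maximized at a vertex of the data simplex, each selection lands on a pure pixel in this noiseless toy example; without pure pixels the selected columns would be mixed spectra, which is the failure mode the MV algorithms are meant to address.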

B. Geometrical Based Approaches: Minimum Volume Based Algorithms

  • The MV approaches seek a mixing matrix that minimizes the volume of the simplex defined by its columns, subject to the constraint that this simplex contains the observed spectral vectors.
  • The optimization (11) minimizes a two term objective function, where the first term measures the approximation error and the second term measures the square of the volume of the simplex defined by the columns of the mixing matrix.
  • Fuzzy clustering algorithms allow every data point to be assigned to every cluster to some degree.
  • Assuming that the data contain several simplexes, the objective function (13) can be used to attempt to find endmember spectra and abundances for each simplex; it relies on membership values that give the degree to which each data point belongs to each simplex.
  • In the top right, SISAL and MVC-NMF produce good results but VCA and N-FINDR show a degradation in performance because there are no pure pixels.
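The quantity that MV algorithms minimize can be computed directly. The sketch below evaluates the volume of the simplex whose vertices are the columns of a matrix, using the Gram determinant of the edge vectors; this is a standard geometric identity rather than a formula quoted from the paper, and the function name `simplex_volume` and the triangle used as input are assumptions.

```python
import numpy as np
from math import factorial

def simplex_volume(M):
    """Volume of the simplex whose vertices are the columns of M, shape (B, p)."""
    D = M[:, 1:] - M[:, :1]              # p - 1 edge vectors from the first vertex
    gram = D.T @ D
    # sqrt(det(D^T D)) is the volume of the parallelotope spanned by the edges;
    # dividing by (p - 1)! gives the simplex volume.
    return np.sqrt(max(np.linalg.det(gram), 0.0)) / factorial(D.shape[1])

# Hypothetical 2-simplex (triangle) with vertices (0,0), (1,0), (0,1).
triangle = np.array([[0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
area = simplex_volume(triangle)
shrunk_area = simplex_volume(0.5 * triangle)
```

The Gram form works when the number of bands exceeds the number of endmembers minus one, which is the usual situation before dimensionality reduction.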

V. STATISTICAL METHODS

  • When the spectral mixtures are highly mixed, the geometrical based methods yield poor results because there are not enough spectral vectors in the simplex facets.
  • Under the statistical framework, spectral unmixing is formulated as a statistical inference problem.
  • The hyperparameters involved in the definition of the parameter priors are then assigned non-informative priors and are jointly estimated from the full posterior of the parameters and hyperparameters.
  • This is the case with DECA [169], [170]; the abundance fractions are modeled as mixtures of Dirichlet densities, thus automatically enforcing the constraints on abundance fractions imposed by the acquisition process, namely nonnegativity and constant sum.
  • Because the true endmembers are not known with certainty, it is reasonable to conclude that the statistical approach produces estimates similar to or better than those of the geometrical based algorithms.

VI. SPARSE REGRESSION BASED UNMIXING

  • The spectral unmixing problem has recently been approached in a semi-supervised fashion, by assuming that the observed image signatures can be expressed in the form of linear combinations of a number of pure spectral signatures known in advance [173]-[175] (e.g., spectra collected on the ground by a field spectro-radiometer).
  • Greedy algorithms such as the orthogonal matching pursuit (OMP) [181] and convex approximations replacing the $\ell_0$ norm with the $\ell_1$ norm, termed basis pursuit (BP) when the data are fitted exactly and basis pursuit denoising (BPDN) [179] when a nonzero fitting error is tolerated, are alternative approaches to compute the sparsest solution.
  • What is, perhaps, totally unexpected is that a sparse vector of fractional abundances can be reconstructed by solving (20) or (21) provided that the columns of the library matrix are incoherent in a given sense [186].
  • The limitation imposed by the high correlation of the spectral signatures is mitigated by the high level of sparsity most often observed in the hyperspectral mixtures.
  • Furthermore, because the libraries are hardly ever acquired under the same conditions as the data sets under consideration, a delicate calibration procedure has to be carried out to adapt either the library to the data set or vice versa [173].
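A greedy sparse regression step of the kind mentioned above can be sketched as follows. This is a minimal orthogonal matching pursuit written for illustration, assuming a library with unit-norm columns and a known sparsity level `k`; it does not enforce the nonnegativity or sum-to-one constraints, and the orthonormal toy library is far more incoherent than any real spectral library.

```python
import numpy as np

def omp(A, y, k):
    """Greedy k-sparse approximation of y from the columns of the library A."""
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        scores = np.abs(A.T @ residual)       # correlation with the residual
        if support:
            scores[support] = -np.inf         # do not pick a column twice
        support.append(int(np.argmax(scores)))
        # Refit all selected columns jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(0)
# Hypothetical library: 10 orthonormal signatures over 20 bands.
library, _ = np.linalg.qr(rng.standard_normal((20, 10)))
x_true = np.zeros(10)
x_true[[2, 7]] = [0.7, 0.3]                   # only two library members are active
y = library @ x_true
x_hat = omp(library, y, k=2)
```

With strongly correlated signatures, the first selection can already be wrong, which is one reason the convex $\ell_1$ formulations are often preferred in practice.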

VII. SPATIAL-SPECTRAL CONTEXTUAL INFORMATION

  • Most of the unmixing strategies presented in the previous sections are based on an objective criterion generally defined in the hyperspectral space.
  • Similarly, the statistical- and sparsity-based algorithms of Sections V and VI exploit similar geometric constraints to penalize a standard data-fitting term (expressed as a likelihood function or quadratic error term).
  • Such valuable information can be of great benefit for analyzing hyperspectral data.
  • In [199], abundance dependencies are modeled using Gaussian Markov random fields, which makes this approach particularly well adapted to unmix images with smooth abundance transition throughout the observed scene.
  • The SPP is intended as a preprocessing module that can be used in combination with an existing spectral-based endmember extraction algorithm.

VIII. SUMMARY

  • More than one decade after Keshava and Mustard's tutorial paper on spectral unmixing published in the IEEE Signal Processing Magazine [1], effective spectral unmixing still remains an elusive exploitation goal and a very active research topic in the remote sensing community.
  • The compendium of techniques presented in this work reflects the increasing sophistication of a field that is rapidly maturing at the intersection of many different disciplines, including signal and image processing, physical modeling, linear algebra and computing developments.
  • A recent trend in hyperspectral imaging in general (and spectral unmixing in particular) has been the computationally efficient implementation of techniques using high performance computing (HPC) architectures [217], [222], [223].
  • Researchers have considered some distributions but not all.
  • Finally, software tools and measurements for large scale quantitative analysis are needed to perform meaningful statistical analyses of algorithm performance.


IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, VOL. 5, NO. 2, APRIL 2012
Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches
José M. Bioucas-Dias, Member, IEEE, Antonio Plaza, Senior Member, IEEE, Nicolas Dobigeon, Member, IEEE, Mario Parente, Member, IEEE, Qian Du, Senior Member, IEEE, Paul Gader, Fellow, IEEE, and Jocelyn Chanussot, Fellow, IEEE
Abstract—Imaging spectrometers measure electromagnetic energy scattered in their instantaneous field of view in hundreds or thousands of spectral channels with higher spectral resolution than multispectral cameras. Imaging spectrometers are therefore often referred to as hyperspectral cameras (HSCs). Higher spectral resolution enables material identification via spectroscopic analysis, which facilitates countless applications that require identifying materials in scenarios unsuitable for classical spectroscopic analysis. Due to low spatial resolution of HSCs, microscopic material mixing, and multiple scattering, spectra measured by HSCs are mixtures of spectra of materials in a scene. Thus, accurate estimation requires unmixing. Pixels are assumed to be mixtures of a few materials, called endmembers. Unmixing involves estimating all or some of: the number of endmembers, their spectral signatures, and their abundances at each pixel. Unmixing is a challenging, ill-posed inverse problem because of model inaccuracies, observation noise, environmental conditions, endmember variability, and data set size. Researchers have devised and investigated many models searching for robust, stable, tractable, and accurate unmixing algorithms. This paper presents an overview of unmixing methods from the time of Keshava and Mustard’s unmixing tutorial [1] to the present. Mixing models are first discussed. Signal-subspace, geometrical, statistical, sparsity-based, and spatial-contextual unmixing algorithms are described. Mathematical problems and potential solutions are described. Algorithm characteristics are illustrated experimentally.
Index Terms—Hyperspectral imaging, hyperspectral remote
sensing, image analysis, image processing, imaging spectroscopy,
inverse problems, linear mixture, machine learning algorithms,
nonlinear mixtures, pattern recognition, remote sensing, sparsity,
spectroscopy, unmixing.
Manuscript received February 27, 2012; accepted March 22, 2012. Date of
publication May 15, 2012; date of current version May 23, 2012.
J. M. Bioucas-Dias is with the Instituto de Telecomunicações, Instituto Superior Técnico, Technical University of Lisbon, Lisbon, Portugal (e-mail: bioucas@lx.it.pt).
A. Plaza is with the Hyperspectral Computing Laboratory, Department of Technology of Computers and Communications, University of Extremadura, 10003 Caceres, Spain (e-mail: aplaza@unex.es).
N. Dobigeon is with the University of Toulouse, IRIT/INP-ENSEEIHT/TeSA, Toulouse, France (e-mail: Nicolas.Dobigeon@enseeiht.fr).
M. Parente is with the Department of Electrical and Computer Engineering, University of Massachusetts Amherst, Amherst, MA 01003 USA (e-mail: mparente@ecs.umass.edu).
Q. Du is with the Department of Electrical and Computer Engineering, Mississippi State University, Mississippi State, MS 39762 USA (e-mail: du@ece.msstate.edu).
P. Gader is with the Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611 USA and GIPSA-Lab, Grenoble Institute of Technology, Grenoble, France (corresponding author, e-mail: pgader@cise.ufl.edu).
J. Chanussot is with the GIPSA-Lab, Grenoble Institute of Technology, Grenoble, France (e-mail: jocelyn.chanussot@gipsa-lab.grenoble-inp.fr).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/JSTARS.2012.2194696
Fig. 1. Hyperspectral imaging concept.
I. INTRODUCTION
Hyperspectral cameras [1]–[11] contribute significantly to earth observation and remote sensing [12], [13]. Their potential motivates the development of small, commercial, high spatial and spectral resolution instruments. They have also been used in food safety [14]–[17], pharmaceutical process monitoring and quality control [18]–[22], and biomedical, industrial, biometric, and forensic applications [23]–[27].
HSCs can be built to function in many regions of the electromagnetic spectrum. The focus here is on those covering the visible, near-infrared, and shortwave infrared spectral bands (in the range 0.3 μm to 2.5 μm [5]). Disregarding atmospheric effects, the signal recorded by an HSC at a pixel is a mixture of light scattered by substances located in the field of view [3]. Fig. 1 illustrates the measured data. They are organized into planes forming a data cube. Each plane corresponds to radiance acquired over a spectral band for all pixels. Each spectral vector corresponds to the radiance acquired at a given location for all spectral bands.
A. Linear and Nonlinear Mixing Models
Hyperspectral unmixing (HU) refers to any process that separates the pixel spectra from a hyperspectral image into a collection of constituent spectra, or spectral signatures, called endmembers and a set of fractional abundances, one set per pixel. The endmembers are generally assumed to represent the pure materials present in the image and the set of abundances, or simply abundances, at each pixel to represent the percentage of each endmember that is present in the pixel.
1939-1404/$31.00 © 2012 IEEE
There are a number of subtleties in this definition. First, the notion of a pure material can be subjective and problem dependent. For example, suppose a hyperspectral image contains spectra measured from bricks laid on the ground, the mortar between the bricks, and two types of plants that are growing through cracks in the brick. One may suppose then that there are four endmembers. However, if the percentage of area that is covered by the mortar is very small then we may not want to have an endmember for mortar. We may just want an endmember for “brick”. It depends on whether we need to directly measure the proportion of mortar present. If we need to measure the mortar, then we may not care to distinguish between the plants since they may have similar signatures. On the other hand, suppose that one type of plant is desirable and the other is an invasive plant that needs to be removed. Then we may want two plant endmembers. Furthermore, one may only be interested in the chlorophyll present in the entire scene. Obviously, this discussion can be continued ad nauseam, but it is clear that the definition of the endmembers can depend upon the application.
The second subtlety is with the proportions. Most researchers assume that a proportion represents the percentage of material associated with an endmember present in the part of the scene imaged by a particular pixel. Indeed, Hapke [28] states that the abundances in a linear mixture represent the relative area of the corresponding endmember in an imaged region. Lab experiments conducted by some of the authors have confirmed this in a laboratory setting. However, in the nonlinear case, the situation is not as straightforward. For example, calibration objects can sometimes be used to map hyperspectral measurements to reflectance, or at least to relative reflectance. Therefore, the coordinates of the endmembers are approximations to the reflectance of the material, which we may assume for the sake of argument to be accurate. The reflectance is usually not a linear function of the mass of the material nor is it a linear function of the cross-sectional area of the material. A highly reflective, yet small object may dominate a much larger but dark object at a pixel, which may lead to inaccurate estimates of the amount of material present in the region imaged by a pixel, but accurate estimates of the contribution of each material to the reflectivity measured at the pixel. Regardless of these subtleties, the large number of applications of hyperspectral research in the past ten years indicates that current models have value.
Unmixing algorithms currently rely on the expected type of mixing. Mixing models can be characterized as either linear or nonlinear [1], [29]. Linear mixing holds when the mixing scale is macroscopic [30] and the incident light interacts with just one material, as is the case in checkerboard type scenes [31], [32]. In this case, the mixing occurs within the instrument itself. It is due to the fact that the resolution of the instrument is not fine enough. The light from the materials, although almost completely separated, is mixed within the measuring instrument. Fig. 2 depicts linear mixing: light scattered by three materials in a scene is incident on a detector that measures radiance in $B$ bands. The measured spectrum $\mathbf{y} \in \mathbb{R}^B$ is a weighted average of the material spectra. The relative amount of each material is represented by the associated weight.
Fig. 2. Linear mixing. The measured radiance at a pixel is a weighted average of the radiances of the materials present at the pixel.
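The weighted-average description of linear mixing translates directly into a matrix-vector product. The sketch below is a minimal illustration under stated assumptions: the function name `linear_mix`, the optional Gaussian noise term, and the three made-up material spectra are not taken from the paper.

```python
import numpy as np

def linear_mix(endmembers, abundances, noise_std=0.0, rng=None):
    """Spectrum of one pixel under the linear mixing model.

    endmembers: (B, p) array, one column per material spectrum.
    abundances: (p,) array of nonnegative weights that sum to one.
    """
    abundances = np.asarray(abundances, dtype=float)
    if np.any(abundances < 0) or not np.isclose(abundances.sum(), 1.0):
        raise ValueError("abundances must be nonnegative and sum to one")
    spectrum = endmembers @ abundances        # weighted average of the columns
    if noise_std > 0:
        rng = np.random.default_rng() if rng is None else rng
        spectrum = spectrum + rng.normal(0.0, noise_std, size=spectrum.shape)
    return spectrum

# Three hypothetical material spectra sampled in B = 4 bands (made-up values).
M = np.array([[0.10, 0.60, 0.30],
              [0.20, 0.50, 0.35],
              [0.40, 0.30, 0.40],
              [0.80, 0.20, 0.45]])
a = np.array([0.5, 0.3, 0.2])                 # relative amount of each material
y = linear_mix(M, a)
```

Because the weights are nonnegative and sum to one, the mixed value in every band lies between the smallest and largest material value in that band.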
Conversely, nonlinear mixing is usually due to physical interactions between the light scattered by multiple materials in the scene. These interactions can be at a classical, or multilayered, level or at a microscopic, or intimate, level. Mixing at the classical level occurs when light is scattered from one or more objects, is reflected off additional objects, and eventually is measured by a hyperspectral imager. A nice illustrative derivation of a multilayer model is given by Borel and Gerstl [33], who show that the model results in an infinite sequence of powers of products of reflectances. Generally, however, the first order terms are sufficient and this leads to the bilinear model. Microscopic mixing occurs when two materials are homogeneously mixed [28]. In this case, the interactions consist of photons emitted from molecules of one material being absorbed by molecules of another material, which may in turn emit more photons. The mixing is modeled by Hapke as occurring at the albedo level and not at the reflectance level. The apparent albedo of the mixture is a linear average of the albedos of the individual substances but the reflectance is a nonlinear function of albedo, thus leading to a different type of nonlinear model.
Fig. 3 illustrates two nonlinear mixing scenarios: the left-hand panel represents an intimate mixture, meaning that the materials are in close proximity; the right-hand panel illustrates a multilayered scene, where there are multiple interactions among the scatterers at the different layers.
Most of this overview is devoted to the linear mixing model. The reason is that, despite its simplicity, it is an acceptable approximation of the light scattering mechanisms in many real scenarios. Furthermore, in contrast to nonlinear mixing, the linear mixing model is the basis of a plethora of unmixing models and algorithms spanning back at least 25 years. A sampling can be found in [1], [34]–[47]. Others will be discussed throughout the rest of this paper.
B. Brief Overview of Nonlinear Approaches
Radiative transfer theory (RTT) [48] is a well established mathematical model for the transfer of energy as photons interact with the materials in the scene. A complete physics based approach to nonlinear unmixing would require inferring the spectral signatures and material densities based on the RTT. Unfortunately, this is an extremely complex ill-posed problem, relying on scene parameters very hard or impossible to obtain. The Hapke [31], Kubelka-Munk [49] and Shkuratov [50] scattering formulations are three approximations for the analytical solution to the RTT. The first has been widely used to study diffuse reflection spectra in chemistry [51] whereas the latter two have been used, for example, in mineral unmixing applications [1], [52].
Fig. 3. Two nonlinear mixing scenarios. Left hand: intimate mixture; Right hand: multilayered scene.
One wide class of strategies is aimed at avoiding the complex physical models using simpler but physics inspired models, such as kernel methods. In [53] and following works [54]–[57], Broadwater et al. have proposed several kernel-based unmixing algorithms to specifically account for intimate mixtures. Some of these kernels are designed to be sufficiently flexible to allow several nonlinearity degrees (using, e.g., radial basis functions or polynomial expansions) while others are physics-inspired kernels [55].
Conversely, bilinear models have been successively proposed in [58]–[62] to handle scattering effects, e.g., occurring in the multilayered scene. These models generalize the standard linear model by introducing additional interaction terms. They mainly differ from each other by the additivity constraints imposed on the mixing coefficients [63].
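The way bilinear models extend the linear model can be sketched in a few lines. The code below assumes one common form, in which each pair of endmembers contributes a term proportional to the product of their abundances times the band-wise product of their spectra, with a single interaction weight `gamma`; the specific models cited above differ precisely in how such coefficients are constrained, so this form, the function name, and the numbers are assumptions.

```python
import numpy as np
from itertools import combinations

def bilinear_mix(endmembers, abundances, gamma=1.0):
    """Linear mixture plus pairwise (second-order) interaction terms."""
    M = np.asarray(endmembers, dtype=float)
    a = np.asarray(abundances, dtype=float)
    y = M @ a                                         # standard linear part
    for i, j in combinations(range(M.shape[1]), 2):
        # Light that interacted with material i and then with material j.
        y = y + gamma * a[i] * a[j] * M[:, i] * M[:, j]
    return y

# Two hypothetical endmembers in two bands (made-up reflectances).
M = np.array([[0.2, 0.4],
              [0.6, 0.8]])
a = np.array([0.5, 0.5])
y_lin = bilinear_mix(M, a, gamma=0.0)    # interaction switched off
y_bil = bilinear_mix(M, a, gamma=1.0)
```

Setting `gamma` to zero recovers the linear model, which makes the role of the additional interaction terms explicit.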
However, limitations inherent to the unmixing algorithms that explicitly rely on both models are twofold. Firstly, they are not multipurpose in the sense that those developed to process intimate mixtures are inefficient in the multiple interaction scenario (and vice versa). Secondly, they generally require the prior knowledge of the endmember signatures. If such information is not available, these signatures have to be estimated from the data by using an endmember extraction algorithm.
To achieve flexibility, some have resorted to machine learning strategies such as neural networks [64]–[70], to nonlinearly reduce dimensionality or learn model parameters in a supervised fashion from a collection of examples (see [35] and references therein). The polynomial post nonlinear mixing model introduced in [71] seems also to be sufficiently versatile to cover a wide class of nonlinearities. However, again, these algorithms assume the prior knowledge or extraction of the endmembers.
Mainly due to the difficulty of the issue, very few attempts have been made to address the problem of fully unsupervised nonlinear unmixing. One must still concede that a significant contribution has been made by Heylen et al. in [72] where a strategy is introduced to extract endmembers that have been nonlinearly mixed. The algorithmic scheme is similar in many respects to the well-known N-FINDR algorithm [73]. The key idea is to maximize the simplex volume computed with geodesic measures on the data manifold. In this work, exact geodesic distances are approximated by shortest-path distances in a nearest-neighbor graph. Even more recently, the same authors have shown in [74] that exact geodesic distances can be derived on any data manifold induced by a nonlinear mixing model, such as the generalized bilinear model introduced in [62].
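The approximation of geodesic distances by shortest paths in a nearest-neighbor graph can be sketched as follows. This is a generic illustration of the idea, not the implementation of [72]: the function name, the choice of `k`, the Floyd-Warshall shortest-path routine, and the half-circle test data are assumptions.

```python
import numpy as np

def geodesic_distances(points, k=2):
    """Approximate geodesic distances between rows of `points` by shortest
    paths in a symmetrized k-nearest-neighbor graph."""
    X = np.asarray(points, dtype=float)
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(axis=2))     # pairwise Euclidean distances
    graph = np.full((n, n), np.inf)               # inf marks "no edge"
    np.fill_diagonal(graph, 0.0)
    nearest = np.argsort(euclid, axis=1)[:, 1:k + 1]
    for i in range(n):
        graph[i, nearest[i]] = euclid[i, nearest[i]]
        graph[nearest[i], i] = euclid[i, nearest[i]]
    # Floyd-Warshall: allow paths through each intermediate node in turn.
    for m in range(n):
        graph = np.minimum(graph, graph[:, m:m + 1] + graph[m:m + 1, :])
    return graph

# Hypothetical curved manifold: 21 points on a unit half circle.
angles = np.linspace(0.0, np.pi, 21)
pts = np.column_stack([np.cos(angles), np.sin(angles)])
D = geodesic_distances(pts, k=2)
```

For the two end points of the half circle the straight-line distance is 2, while the graph distance follows the curve and approaches the arc length π from below as the sampling gets denser.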
Quite recently, Close and Gader have devised two methods for fully unsupervised nonlinear unmixing in the case of intimate mixtures [75], [76] based on Hapke’s average albedo model cited above. One method assumes that each pixel is either linearly or nonlinearly mixed. The other assumes that there can be both nonlinear and linear mixing present in a single pixel. The methods were shown to more accurately estimate physical mixing parameters using measurements made by Mustard et al. [56], [57], [64], [77] than existing techniques. There is still a great deal of work to be done, including evaluating the usefulness of combining bilinear models with average albedo models.
In summary, although researchers are beginning to expand more aggressively into nonlinear mixing, the research is immature compared with linear mixing. There has been a tremendous effort in the past decade to solve linear unmixing problems and that is what will be discussed in the rest of this paper.
C. Hyperspectral Unmixing Processing Chain
Fig. 4 shows the processing steps usually involved in the hyperspectral unmixing chain: atmospheric correction, dimensionality reduction, and unmixing, which may be tackled via the classical endmember determination plus inversion, or via sparse regression or sparse coding approaches. Often, endmember determination and inversion are implemented simultaneously. Below, we provide a brief characterization of each of these steps:
1) Atmospheric correction. The atmosphere attenuates and scatters the light and therefore affects the radiance at the sensor. The atmospheric correction compensates for these effects by converting radiance into reflectance, which is an intrinsic property of the materials. We stress, however, that linear unmixing can be carried out directly on radiance data.
2) Data reduction. The dimensionality of the space spanned by spectra from an image is generally much lower than the number of available bands. Identifying appropriate subspaces facilitates dimensionality reduction, which improves algorithm performance and reduces complexity and data storage. Furthermore, if the linear mixture model is accurate, the signal subspace dimension is equal to the number of endmembers (or one less than it, for the affine set that contains the data under the sum-to-one constraint), a crucial figure in hyperspectral unmixing.
3) Unmixing. The unmixing step consists of identifying the endmembers in the scene and the fractional abundances at each pixel. Three general approaches will be discussed here. Geometrical approaches exploit the fact that linearly mixed vectors are in a simplex set or in a positive cone. Statistical approaches focus on using parameter estimation techniques to determine endmember and abundance parameters. Sparse regression approaches formulate unmixing as a linear sparse regression problem, in a fashion similar to that of compressive sensing [78], [79]. This framework relies on the existence of spectral libraries, usually acquired in the laboratory. A step forward, termed sparse coding [80], consists of learning the dictionary from the data, thus avoiding not only the need for libraries but also calibration issues related to the different conditions under which the libraries and the data were acquired.
4) Inversion. Given the observed spectral vectors and the identified endmembers, the inversion step consists of solving a constrained optimization problem which minimizes the residual between the observed spectral vectors and the linear space spanned by the inferred spectral signatures; the implicit fractional abundances are, very often, constrained to be nonnegative and to sum to one (i.e., they belong to the probability simplex). There are, however, many hyperspectral unmixing approaches in which the endmember determination and inversion steps are implemented simultaneously.

Fig. 4. Schematic diagram of the hyperspectral unmixing process.

Fig. 5. Illustration of the simplex set $C$ for $p = 3$ ($C$ is the convex hull of the columns of $\mathbf{M}$, $C = \mathrm{conv}\{\mathbf{M}\}$). Green circles represent spectral vectors. Red circles represent vertices of the simplex and correspond to the endmembers.
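The inversion step of item 4 can be sketched for a single pixel as a least-squares fit over the probability simplex. The sketch below uses projected gradient descent, chosen here only because it is short; it is not a solver prescribed by the paper. The names `project_to_simplex` and `unmix_pixel`, the iteration count, and the endmember values are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {a : a_j >= 0, sum_j a_j = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        t = (cumulative - 1.0) / k
        if u_k - t > 0:
            theta = t  # threshold of the last index that stays positive
    return [max(x - theta, 0.0) for x in v]


def unmix_pixel(endmembers, pixel, n_iter=500):
    """Minimize ||pixel - M a||^2 subject to nonnegativity and sum-to-one."""
    p = len(endmembers)
    n_bands = len(pixel)
    # Gram matrix G = M^T M and correlation vector c = M^T y.
    gram = [[sum(endmembers[j][i] * endmembers[k][i] for i in range(n_bands))
             for k in range(p)] for j in range(p)]
    corr = [sum(endmembers[j][i] * pixel[i] for i in range(n_bands))
            for j in range(p)]
    # 1 / trace(G) never exceeds 1 / (largest eigenvalue of G), so it is a safe step.
    step = 1.0 / sum(gram[j][j] for j in range(p))
    a = [1.0 / p] * p  # start at the center of the simplex
    for _ in range(n_iter):
        grad = [sum(gram[j][k] * a[k] for k in range(p)) - corr[j]
                for j in range(p)]
        a = project_to_simplex([a[j] - step * grad[j] for j in range(p)])
    return a


# Hypothetical example: p = 3 endmembers observed in 4 bands (made-up values).
endmembers = [
    [1.0, 0.2, 0.1, 0.0],
    [0.1, 1.0, 0.2, 0.1],
    [0.0, 0.1, 1.0, 0.9],
]
true_abundances = [0.6, 0.3, 0.1]
pixel = [sum(a * m[i] for a, m in zip(true_abundances, endmembers))
         for i in range(4)]
estimate = unmix_pixel(endmembers, pixel)
```

Because the example pixel is noise-free and the signatures are linearly independent, the estimate should agree with `true_abundances` up to numerical tolerance; with noisy data the result is simply the closest point of the simplex in the least-squares sense.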
The remainder of the paper is organized as follows. Section II describes the linear spectral mixture model adopted as the baseline model in this contribution. Section III describes techniques for subspace identification. Sections IV, V, VI, and VII describe four classes of techniques for endmember and fractional abundance estimation under linear spectral unmixing. Sections IV and V are devoted to the longstanding geometrical and statistical based approaches, respectively. Sections VI and VII are devoted to the recently introduced sparse regression based unmixing and to the exploitation of the spatial contextual information, respectively. Each of these sections introduces the underlying mathematical problem and summarizes state-of-the-art algorithms to address it.
Experimental results obtained from simulated and real data sets illustrating the potential and limitations of each class of algorithms are described. The experiments do not constitute an exhaustive comparison. Both code and data for all the experiments described here are available at http://www.lx.it.pt/~bioucas/code/unmixing_overview.zip. The paper concludes with a summary and discussion of plausible future developments in the area of spectral unmixing.

Fig. 6. Illustration of the concept of simplex of minimum volume containing the data for three data sets. The endmembers in the left hand side and in the middle are identifiable by fitting a simplex of minimum volume to the data, whereas this is not applicable to the right hand side data set. The latter data set corresponds to a highly mixed scenario.
II. LINEAR MIXTURE MODEL
If the multiple scattering among distinct endmembers is negligible and the surface is partitioned according to the fractional abundances, as illustrated in Fig. 2, then the spectrum of each pixel is well approximated by a linear mixture of endmember spectra weighted by the corresponding fractional abundances [1], [3], [29], [39]. In this case, the spectral measurement$^{1}$ at channel $i \in \{1, \dots, B\}$ ($B$ is the total number of channels) from a given pixel, denoted by $y_i$, is given by the linear mixing model (LMM)

$$y_i = \sum_{j=1}^{p} \rho_{ij}\,\alpha_j + w_i \tag{1}$$

where $\rho_{ij} \geq 0$ denotes the spectral measurement of endmember $j \in \{1, \dots, p\}$ at the $i$th spectral band, $\alpha_j \geq 0$ denotes the fractional abundance of endmember $j$, $w_i$ denotes an additive perturbation (e.g., noise and modeling errors), and $p$ denotes the number of endmembers. At a given pixel, the fractional abundance $\alpha_j$, as the name suggests, represents the fractional area occupied by the $j$th endmember. Therefore, the fractional abundances are subject to the following constraints:

$$\alpha_j \geq 0, \quad j = 1, \dots, p, \qquad \sum_{j=1}^{p} \alpha_j = 1 \tag{2}$$

i.e., the fractional abundance vector $\boldsymbol{\alpha} \equiv [\alpha_1, \alpha_2, \dots, \alpha_p]^T$ (the notation $(\cdot)^T$ indicates vector transposed) is in the standard $(p-1)$-simplex (or unit $(p-1)$-simplex). In HU jargon, the nonnegativity and the sum-to-one constraints are termed abundance nonnegativity constraint (ANC) and abundance sum constraint (ASC), respectively. Researchers may sometimes expect that the abundance fractions sum to less than one since an algorithm may not be able to account for every material in a pixel; it is not clear whether it is better to relax the constraint or to simply consider that part of the modeling error.
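As a minimal sketch of the linear mixing model and of the ANC and ASC constraints, the snippet below synthesizes one pixel from a handful of endmember signatures. The function names `mix_pixel` and `satisfies_anc_asc`, the tolerance, and all numerical values are hypothetical and serve only to make the per-band sum concrete.

```python
def mix_pixel(endmembers, abundances, noise=None):
    """Return y with y[i] = sum_j endmembers[j][i] * abundances[j] + noise[i]."""
    n_bands = len(endmembers[0])
    if noise is None:
        noise = [0.0] * n_bands  # noise-free case
    return [
        sum(m[i] * a for m, a in zip(endmembers, abundances)) + noise[i]
        for i in range(n_bands)
    ]


def satisfies_anc_asc(abundances, tol=1e-9):
    """Check the nonnegativity (ANC) and sum-to-one (ASC) constraints."""
    anc = all(a >= -tol for a in abundances)
    asc = abs(sum(abundances) - 1.0) <= tol
    return anc and asc


# Hypothetical signatures: p = 3 endmembers, B = 4 bands (values are made up).
endmembers = [
    [0.10, 0.20, 0.60, 0.70],
    [0.50, 0.50, 0.40, 0.30],
    [0.05, 0.10, 0.10, 0.15],
]
abundances = [0.5, 0.3, 0.2]
pixel = mix_pixel(endmembers, abundances)
```

For these values the first band evaluates to 0.5 * 0.10 + 0.3 * 0.50 + 0.2 * 0.05 = 0.21, and a vector such as [0.7, 0.5, -0.2] would satisfy the ASC but violate the ANC.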
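Let $\mathbf{y} \in \mathbb{R}^B$ denote a $B$-dimensional column vector, and $\mathbf{m}_j \equiv [\rho_{1j}, \dots, \rho_{Bj}]^T$ denote the spectral signature of the $j$th endmember. Expression (1) can then be written as

$$\mathbf{y} = \mathbf{M}\boldsymbol{\alpha} + \mathbf{w} \tag{3}$$

where $\mathbf{M} \equiv [\mathbf{m}_1, \dots, \mathbf{m}_p]$ is the mixing matrix containing the signatures of the endmembers present in the covered area, and $\mathbf{w} \equiv [w_1, \dots, w_B]^T$. Assuming that the columns of $\mathbf{M}$ are affinely independent, i.e., $\mathbf{m}_2 - \mathbf{m}_1, \mathbf{m}_3 - \mathbf{m}_1, \dots, \mathbf{m}_p - \mathbf{m}_1$ are linearly independent, then the set

$$C \equiv \Big\{ \mathbf{y} = \mathbf{M}\boldsymbol{\alpha} \;:\; \sum_{j=1}^{p} \alpha_j = 1,\ \alpha_j \geq 0,\ j = 1, \dots, p \Big\}$$

i.e., the convex hull of the columns of $\mathbf{M}$, is a $(p-1)$-simplex in $\mathbb{R}^B$. Fig. 5 illustrates a 2-simplex $C$ for a hypothetical mixing matrix $\mathbf{M}$ containing three endmembers. The points in green denote spectral vectors, whereas the points in red are vertices of the simplex and correspond to the endmembers. Note that the inference of the mixing matrix $\mathbf{M}$ is equivalent to identifying the vertices of the simplex $C$. This geometrical point of view, exploited by many unmixing algorithms, will be further developed in Section IV-B.

$^{1}$ Although the type of spectral quantity (radiance, reflectance, etc.) is important when processing data, specification is not necessary to derive the mathematical approaches.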
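A small worked example may help fix the geometry. The numbers below are made up for illustration (they are not taken from the paper), with $p = 3$ endmembers and only $B = 2$ bands so that the simplex can be drawn in the plane.

```latex
% Hypothetical mixing matrix with p = 3 endmembers and B = 2 bands.
\[
\mathbf{M} = [\mathbf{m}_1, \mathbf{m}_2, \mathbf{m}_3]
           = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
\qquad
\mathbf{m}_2 - \mathbf{m}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix},
\quad
\mathbf{m}_3 - \mathbf{m}_1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}.
\]
% The two difference vectors are linearly independent, so the columns of M
% are affinely independent and C is a 2-simplex (a triangle).
\[
\boldsymbol{\alpha} = [0.5,\ 0.3,\ 0.2]^T
\;\Longrightarrow\;
\mathbf{y} = \mathbf{M}\boldsymbol{\alpha} = [0.3,\ 0.2]^T \in C .
\]
% A point outside the triangle needs a negative abundance:
\[
\mathbf{y}' = [0.8,\ 0.5]^T
\;\Longrightarrow\;
\alpha_2 = 0.8,\ \alpha_3 = 0.5,\ \alpha_1 = 1 - 0.8 - 0.5 = -0.3 < 0
\;\Longrightarrow\;
\mathbf{y}' \notin C .
\]
```

Affine independence makes the abundance vector of any point in the affine hull unique, which is why identifying the three vertices is enough to recover $\mathbf{M}$ in this toy case.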
Since many algorithms adopt either a geometrical or a statistical framework [34], [36], they are a focus of this paper. To motivate these two directions, let us consider the three data sets shown in Fig. 6 generated under the linear model given in (3) where the noise is assumed to be negligible. The spectral vectors generated according to (3) are in a simplex whose vertices correspond to the endmembers. The left hand side data set contains pure pixels, i.e., for any of the $p$ endmembers there is at least one pixel containing only the corresponding material; the data set in the middle does not contain pure pixels but contains at least $p-1$ spectral vectors on each facet. In both data sets (left and middle), the endmembers may be inferred by fitting a minimum volume (MV) simplex to the data; this rather simple and yet powerful idea, introduced by Craig in his seminal work [81], underlies several geometrical based unmixing algorithms. A similar idea was introduced in 1989 by Perczel et al. in the area of chemometrics [82].
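For the pure-pixel scenario (left hand side data set), the simplex fitted to the data has its vertices among the pixels themselves, so a very simple stand-in for the geometrical idea is to search for the $p$ pixels that span the largest simplex. The sketch below does this by exhaustive search. It is not Craig's minimum-volume algorithm and it does not cover the middle data set, where no pure pixels exist; the function names and the toy data are hypothetical.

```python
import itertools
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def simplex_volume(vertices):
    """Volume of the simplex spanned by p points, via the Gram determinant."""
    base = vertices[0]
    edges = [[x - b for x, b in zip(v, base)] for v in vertices[1:]]
    gram = [[sum(x * y for x, y in zip(e, f)) for f in edges] for e in edges]
    return math.sqrt(max(determinant(gram), 0.0)) / math.factorial(len(edges))


def max_volume_vertices(pixels, p):
    """Exhaustive search for the indices of the p pixels spanning the largest simplex."""
    return max(
        itertools.combinations(range(len(pixels)), p),
        key=lambda idx: simplex_volume([pixels[i] for i in idx]),
    )


# Hypothetical scene: 3 endmembers in 3 bands; three of the six pixels are pure.
endmembers = [[0.9, 0.1, 0.2], [0.2, 0.8, 0.1], [0.1, 0.3, 0.9]]
abundances = [
    [0.5, 0.5, 0.0],
    [1.0, 0.0, 0.0],  # pure pixel of endmember 0
    [0.2, 0.3, 0.5],
    [0.0, 0.0, 1.0],  # pure pixel of endmember 2
    [0.6, 0.1, 0.3],
    [0.0, 1.0, 0.0],  # pure pixel of endmember 1
]
pixels = [
    [sum(a * m[i] for a, m in zip(alpha, endmembers)) for i in range(3)]
    for alpha in abundances
]
found = max_volume_vertices(pixels, p=3)
```

In this toy scene the search selects the three pure pixels (indices 1, 3, and 5). The exhaustive search grows combinatorially with the number of pixels, so it is meant only to illustrate the volume criterion, not as a practical algorithm.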
The MV simplex shown in the right hand side example of Fig. 6 is smaller than the true one. This situation corresponds to a highly mixed data set where there are no spectral vectors near the facets. For these classes of problems, we usually resort to the statistical framework, in which the estimation of the mixing matrix and of the fractional abundances is formulated as a statistical inference problem by adopting suitable probability models for the variables and parameters involved, namely for the fractional abundances and for the mixing matrix.
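In symbols, one common way to write such an inference problem is sketched below. The assumptions are not spelled out in the text above and are made here only for illustration: $N$ pixels $\mathbf{y}_1, \dots, \mathbf{y}_N$ collected in $\mathbf{Y}$, their abundance vectors $\boldsymbol{\alpha}_1, \dots, \boldsymbol{\alpha}_N$ collected in $\mathbf{A}$, independent zero-mean Gaussian noise of variance $\sigma^2$, and independent priors $p(\mathbf{M})$ and $p(\mathbf{A})$.

```latex
\begin{align*}
p(\mathbf{M}, \mathbf{A} \mid \mathbf{Y})
  &\propto p(\mathbf{Y} \mid \mathbf{M}, \mathbf{A})\, p(\mathbf{M})\, p(\mathbf{A}), \\
p(\mathbf{Y} \mid \mathbf{M}, \mathbf{A})
  &= \prod_{n=1}^{N} \mathcal{N}\!\left(\mathbf{y}_n ;\, \mathbf{M}\boldsymbol{\alpha}_n,\, \sigma^2 \mathbf{I}\right), \\
(\widehat{\mathbf{M}}, \widehat{\mathbf{A}})_{\mathrm{MAP}}
  &= \arg\min_{\mathbf{M}, \mathbf{A}}\;
     \frac{1}{2\sigma^2} \sum_{n=1}^{N} \left\| \mathbf{y}_n - \mathbf{M}\boldsymbol{\alpha}_n \right\|^2
     - \log p(\mathbf{M}) - \log p(\mathbf{A}).
\end{align*}
```

Under these assumptions the data term is the same least-squares residual used in the inversion step, while the negative log-priors play the role of regularizers; a prior on $\boldsymbol{\alpha}_n$ supported on the probability simplex, for instance, enforces the ANC and ASC.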
