Proceedings Article

Matrix Cofactorization for Joint Unmixing and Classification of Hyperspectral Images

01 Sep 2019-pp 1-5

TL;DR: This paper introduces a matrix cofactorization approach to perform spectral unmixing and classification jointly using a proximal alternating linearized minimization algorithm (PALM) ensuring convergence to a critical point.
Abstract: This paper introduces a matrix cofactorization approach to perform spectral unmixing and classification jointly. After formulating the unmixing and classification tasks as matrix factorization problems, a link is introduced between the two coding matrices, namely the abundance matrix and the feature matrix. This coupling term can be interpreted as a clustering term where the abundance vectors are clustered and the resulting attribution vectors are then used as feature vectors. The overall non-smooth, non-convex optimization problem is solved using a proximal alternating linearized minimization algorithm (PALM) ensuring convergence to a critical point. The quality of the obtained results is finally assessed by comparison to other conventional algorithms on a semi-synthetic yet realistic dataset.
Topics: Matrix decomposition (61%), Matrix (mathematics) (56%), Feature vector (55%), Cluster analysis (51%)

Summary (2 min read)

Introduction

  • Index Terms—supervised learning, spectral unmixing, cofactorization, hyperspectral images.
  • In particular classification algorithms received a lot of attention from the scientific community.
  • In the specific case of hyperspectral images (HSI), images capture a very rich signal since each pixel is a sampling of the reflectance spectrum of the corresponding area, typically in the visible and infrared spectral domains with hundreds of measurements.
  • The core concept is to express the two problems of interest, namely spectral unmixing and classification, as factorization problems and then to introduce a coupling term to intertwine the two estimations.
  • Finally, the method is tested and compared to other unmixing and classification methods in Section IV.

II. PROBLEM STATEMENT

  • As presented in Sections II-A and II-B, spectral unmixing and supervised classification are commonly expressed as factorization problems.
  • In the proposed model, the link is made between the abundance matrix and the feature matrix.
  • More precisely, the coupling term is expressed as a clustering term over the abundance vectors where the attribution vectors to the clusters are also the feature vectors of the classification as detailed in Section II-C.

A. Spectral unmixing

  • These abundance vectors describe the mixture contained in the pixel.
  • In addition to the data fitting term, two penalization terms are considered in the proposed unmixing model.
  • The term $\imath_{\mathbb{R}_+^{R \times P}}(\mathbf{A})$ enforces a nonnegativity constraint, ensuring an additive decomposition of the spectra.
  • The second penalization $\lambda_a \|\mathbf{A}\|_1$ is a sparsity penalization promoting the concept that only a few endmembers are active in a given pixel.
  • In the following work, the choice has been made to discard the estimation of the endmember matrix for the sake of simplicity.

B. Classification

  • Numerous decision rules have been proposed to carry out classification.
  • The weighting coefficients $d_p$ adjust the cost function with respect to the sizes of the training and test sets, in particular in the case of unbalanced classes.
  • Moreover, the nonlinear mapping φ(·) is chosen as a sigmoid, which makes the proposed classifier interpretable as a one layer neural network.
  • The second considered penalization is a spatial regularization enforced through a smoothed weighted vectorial total variation norm (vTV).
  • They are computed beforehand using external data containing information on the spatial structures, e.g., a panchromatic image or a LIDAR image [11].

C. Clustering

  • To define a global cofactorization problem, a relation is drawn between the activation matrices of the two factorization problems, namely the abundance matrix and the feature matrix.
  • Abundance vectors are clustered and the resulting attribution vectors are then used as feature vectors for the classification.
  • Thus, the resulting clustering method is a particular instance of k-means where the attribution vectors are relaxed and can be interpreted as the collection of probabilities of belonging to each of the clusters.

D. Multi-objective problem

  • The two factorization problems corresponding to the spectral unmixing and classification tasks have been expressed and the link between these two problems has been set up through the clustering term.

III. OPTIMIZATION SCHEME

  • The proposed global optimization problem (8) is nonconvex and non-smooth.
  • Such problems are usually very challenging to solve.
  • The concept of this algorithm is to perform a proximal gradient descent with respect to each variable alternately.
  • In the present case, the partial gradients are easily computed and are all globally Lipschitz.
  • As for the proximal operators, they are well known [12] except for $f_0(\cdot)$.

IV. EXPERIMENTS

  • Data generation – The HSI used to perform the experiments is a semi-synthetic image.
  • For the last hyperparameter $\tilde{\lambda}_c$, two values have been considered, 0 and 0.1, standing respectively for the case without and with spatial regularization.
  • It should be noted that all unmixing methods use directly the correct endmember matrix M which has been used to generate the data.
  • Processing time is indeed higher for the proposed cofactorization method than for RF, FCLS and CBPDN.
  • In terms of qualitative results, Figure 3 presents the classification maps which appear consistent with the quantitative results.

V. CONCLUSION AND PERSPECTIVE

  • This paper introduces a unified framework to perform spectral unmixing and classification jointly by means of a cofactorization problem.
  • The overall cofactorization task is formulated as a non-convex nonsmooth optimization problem whose solution was approximated thanks to a PALM algorithm which ensured some convergence guarantees.


Official URL
DOI: https://doi.org/10.23919/EUSIPCO.2019.8903037
Any correspondence concerning this service should be sent
to the repository administrator: tech-oatao@listes-diff.inp-toulouse.fr
This is an author’s version published in:
http://oatao.univ-toulouse.fr/24982
Open Archive Toulouse Archive Ouverte (OATAO) is an open access repository that collects the work of Toulouse researchers and makes it freely available over the web where possible.
To cite this version: Lagrange, Adrien and Fauvel, Mathieu and
May, Stéphane and Bioucas-Dias, José M. and Dobigeon, Nicolas
Matrix Cofactorization for Joint Unmixing and Classification of
Hyperspectral Images. (2019) In: 27th European Signal Processing
Conference (EUSIPCO 2019), 2 September 2019 - 6 September 2019
(A Coruna, Spain).

Abstract: This paper introduces a matrix cofactorization approach to perform spectral unmixing and classification jointly. After formulating the unmixing and classification tasks as matrix factorization problems, a link is introduced between the two coding matrices, namely the abundance matrix and the feature matrix. This coupling term can be interpreted as a clustering term where the abundance vectors are clustered and the resulting attribution vectors are then used as feature vectors. The overall non-smooth, non-convex optimization problem is solved using a proximal alternating linearized minimization algorithm (PALM) ensuring convergence to a critical point. The quality of the obtained results is finally assessed by comparison to other conventional algorithms on a semi-synthetic yet realistic dataset.
Index Terms: supervised learning, spectral unmixing, cofactorization, hyperspectral images.
I. INTRODUCTION
Following the fast increase of available remote sensing images, many methods have been proposed to extract information from such specific data. In particular, classification algorithms have received a lot of attention from the scientific community. The emergence of state-of-the-art algorithms such as convolutional neural networks [1] or random forests [2] has brought unprecedentedly good results. In the so-called supervised classification framework, these algorithms make it possible to infer a classification rule from a reduced number of examples provided by an expert. This rule is then used to attribute to unknown pixels a class among a predefined set of classes. Although very efficient, classification methods provide only a limited analysis of the image since they attribute a single class to each pixel when it is sometimes possible to extract more information. In the specific case of hyperspectral images (HSI), images capture a very rich signal since each pixel is a sampling of the reflectance spectrum of the corresponding area, typically in the visible and infrared spectral domains with hundreds of measurements. To fully exploit the available information, it is interesting to resort to alternative methods of interpretation such as representation learning methods, namely spectral unmixing in the case of HSI [3]. Spectral unmixing is a physics-based model which assumes that a given pixel, i.e. a given measured spectrum, is the result of the combination of a reduced number of elementary spectra called endmembers, each specific to a given material. The aim of unmixing methods is to infer the proportion of each material present in the pixel. The obtained abundance maps display the spatial distribution of the materials in the observed scene.

Part of this work has been supported by Centre National d'Études Spatiales (CNES), Occitanie Region, EU FP7 through the ERANETMED JC-WATER program, MapInvPlnt Project ANR-15-NMED-0002-02 and ANR-3IA Artificial and Natural Intelligence Toulouse Institute (ANITI).

Even if classification and spectral unmixing are two widely used techniques, very few attempts have been made to combine them. Most of these works [4], [5] intend to improve classification results by using spectral unmixing to identify mixed pixels and then process specifically the identified mixed pixels. Instead of using the two methods sequentially, the method proposed in this paper introduces the idea of a joint unmixing and classification. This method is formulated as a cofactorization problem, which is known to produce valuable results in many application fields such as music source separation [6] and image analysis [7]. The core concept is to express the two problems of interest, namely spectral unmixing and classification, as factorization problems and then to introduce a coupling term to intertwine the two estimations. Similarly to [8], the coupling term is defined as a clustering term where the abundance vectors provided by the unmixing step are clustered and the resulting attribution vectors are then used as feature vectors for the classification. The overall optimization problem is non-convex and non-smooth. Such problems are known to be challenging to solve but, building on recent advances in optimization, the PALM algorithm proposed in [9] is used as an optimization scheme, thus guaranteeing convergence to a critical point of the objective function.
The rest of this paper is organized as follows. Section II
defines the two factorization tasks and introduces the global
cofactorization problem. Then, the method used to minimize
the resulting criterion is presented in Section III. Finally,
the method is tested and compared to other unmixing and
classification methods in Section IV. Section V draws some
conclusions and perspectives.
II. PROBLEM STATEMENT
As presented in Sections II-A and II-B, spectral unmixing and supervised classification are commonly expressed as factorization problems. We propose to derive a unified framework by considering a global cofactorization problem. It relies on a link between the two factorization problems in order to perform a joint estimation. In the proposed model, the link is made between the abundance matrix and the feature matrix. More precisely, the coupling term is expressed as a clustering term over the abundance vectors where the attribution vectors to the clusters are also the feature vectors of the classification, as detailed in Section II-C.

Matrix cofactorization for joint unmixing and classification of hyperspectral images
Adrien Lagrange, Mathieu Fauvel, Stéphane May, José M. Bioucas-Dias and Nicolas Dobigeon
IRIT/INP-ENSEEIHT, University of Toulouse, Toulouse, France
Centre d'Études Spatiales de la BIOsphère (CESBIO), INRA, Toulouse, France
Centre National d'Études Spatiales (CNES), DCT/SI/AP, Toulouse, France
Instituto de Telecomunicações, Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisbon, Portugal
firstname.name@{enseeiht,inra,cnes,enseeiht}.fr, bioucas@lx.it.pt

A. Spectral unmixing
Each pixel of an HSI is an $L$-dimensional measurement of a reflectance spectrum. Physics models this spectrum as a combination of $R$ elementary spectra, each characterizing a specific material and gathered in the so-called endmember matrix $\mathbf{M} \in \mathbb{R}^{L \times R}$. The spectral unmixing task aims at retrieving the so-called abundance vectors $\mathbf{a}_p \in \mathbb{R}^{R}$, with $R \ll L$, from the spectrum $\mathbf{y}_p \in \mathbb{R}^{L}$ of the $p$th pixel ($p \in \mathcal{P}$ where $\mathcal{P} \triangleq \{1, \ldots, P\}$ is the set of pixel indexes). These abundance vectors describe the mixture contained in the pixel. Using the conventional linear mixture model, the spectral unmixing problem can be expressed as follows

$$\min_{\mathbf{M}, \mathbf{A}} \; \frac{1}{2} \|\mathbf{Y} - \mathbf{M}\mathbf{A}\|_F^2 + \lambda_a \|\mathbf{A}\|_1 + \imath_{\mathbb{R}_+^{R \times P}}(\mathbf{A}) \tag{1}$$

where matrix $\mathbf{Y} \in \mathbb{R}^{L \times P}$ gathers the $P$ pixel spectra and $\mathbf{A} \in \mathbb{R}^{R \times P}$ the abundance vectors. In addition to the data fitting term, two penalization terms are considered in the proposed unmixing model. The term $\imath_{\mathbb{R}_+^{R \times P}}(\mathbf{A})$ enforces a nonnegativity constraint, ensuring an additive decomposition of the spectra. The second penalization $\lambda_a \|\mathbf{A}\|_1$ is a sparsity penalization promoting the concept that only a few endmembers are active in a given pixel. In the following work, the choice has been made to discard the estimation of the endmember matrix for the sake of simplicity. The endmember matrix is assumed to be known or estimated beforehand.
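As an illustration of problem (1) with the endmember matrix held fixed, the following minimal sketch evaluates the objective with NumPy. The function name `unmixing_objective` and the toy sizes are assumptions made for this example only; they do not come from the paper.

```python
import numpy as np

def unmixing_objective(Y, M, A, lam_a):
    """Value of problem (1) for a fixed endmember matrix M (illustrative helper)."""
    # Indicator of the nonnegative orthant: +inf as soon as one entry is negative.
    if np.any(A < 0):
        return np.inf
    data_fit = 0.5 * np.linalg.norm(Y - M @ A, "fro") ** 2
    sparsity = lam_a * np.abs(A).sum()
    return data_fit + sparsity

# Toy data (assumed sizes): L = 4 bands, R = 2 endmembers, P = 3 pixels.
M = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8]])
A = np.array([[1.0, 0.3, 0.0], [0.0, 0.7, 1.0]])
Y = M @ A  # noiseless linear mixture
value = unmixing_objective(Y, M, A, lam_a=0.1)
```

With noiseless data the data fitting term vanishes, so only the sparsity penalty contributes to `value`; any negative abundance makes the objective infinite.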
B. Classification
In the context of supervised classification, a subset of pixels is available with their corresponding groundtruth. The index subset of labeled pixels is denoted hereafter $\mathcal{L}$ while the index subset of unlabeled pixels is $\mathcal{U}$ ($\mathcal{L} \cap \mathcal{U} = \emptyset$ and $\mathcal{L} \cup \mathcal{U} = \mathcal{P}$). Classification intends to assign one of the $C$ classes to each pixel. In practice, classifying can be formulated as estimating a $C \times P$ matrix $\mathbf{C}$ whose columns correspond to unknown $C$-dimensional attribution vectors $\mathbf{c}_p$ ($p \in \mathcal{U}$). Each vector is made of 0 except for $c_{i,p} = 1$ when the $p$th pixel is assigned the $i$th class. Numerous decision rules have been proposed to carry out classification. Most of them rely on the use of feature vectors $\mathbf{z}_p \in \mathbb{R}^{K}$ ($p \in \mathcal{P}$) associated with the $P$ pixels, gathered in the matrix $\mathbf{Z} \in \mathbb{R}^{K \times P}$. Considering a linear classifier parametrized by the matrix $\mathbf{Q} \in \mathbb{R}^{C \times K}$, a vector-wise nonlinear mapping $\phi(\cdot)$, such as a sigmoid or a softmax operator, is then applied to the output of the classifier. Finally, the classification rule can be expressed as the matrix factorization problem

$$\min_{\mathbf{Q}, \mathbf{C}_{\mathcal{U}}} \; J_c(\mathbf{C}, \phi(\mathbf{Q}\mathbf{Z})) + \imath_{\mathbb{S}_C^{|\mathcal{U}|}}(\mathbf{C}_{\mathcal{U}}) \tag{2}$$

where $J_c(\cdot, \cdot)$ is a cost function measuring the quality of the estimated attribution vectors $\phi(\mathbf{Q}\mathbf{z}_p)$ and $\mathbb{S}_C$ is the $C$-dimensional probability simplex ensuring nonnegativity and sum-to-one constraints of the attribution vectors. In this work, the cost function $J_c(\cdot, \cdot)$ has been chosen as the cross-entropy, defined in a multi-class problem as

$$J_c(\mathbf{C}, \hat{\mathbf{C}}) = -\sum_{p \in \mathcal{P}} d_p \sum_{i \in \mathcal{C}} c_{i,p} \log(\hat{c}_{i,p}) \tag{3}$$

with

$$d_p = \begin{cases} \dfrac{1}{|\mathcal{L}_i|}, & \text{if } p \in \mathcal{L}_i, \\[4pt] \dfrac{1}{|\mathcal{U}|}, & \text{if } p \in \mathcal{U}, \end{cases} \tag{4}$$

where $\mathcal{L}_i$ is the subset of labeled pixels belonging to class $i$, $\hat{\mathbf{c}}_p$ is the estimated attribution vector and $\mathbf{c}_p$ the true one. The weighting coefficients $d_p$ adjust the cost function with respect to the sizes of the training and test sets, in particular in the case of unbalanced classes. This particular loss function has been extensively used in the context of neural networks [10]. Moreover, the nonlinear mapping $\phi(\cdot)$ is chosen as a sigmoid, which makes the proposed classifier interpretable as a one-layer neural network.
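The weighted cross-entropy of (3) and the weights of (4) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the encoding of unlabeled pixels by the label `-1`, the small constant `eps` added inside the logarithm, and the function names are all assumptions of this example.

```python
import numpy as np

def class_weights(labels, n_classes):
    """Weights d_p of (4); labels[p] is a class index, or -1 for an unlabeled pixel (assumed encoding)."""
    labels = np.asarray(labels)
    d = np.empty(labels.shape, dtype=float)
    unlabeled = labels < 0
    if unlabeled.any():
        d[unlabeled] = 1.0 / unlabeled.sum()      # 1 / |U|
    for i in range(n_classes):
        in_class = labels == i
        if in_class.any():
            d[in_class] = 1.0 / in_class.sum()    # 1 / |L_i|
    return d

def weighted_cross_entropy(C, C_hat, d, eps=1e-12):
    """J_c(C, C_hat) of (3); C and C_hat are (n_classes, P) arrays, eps avoids log(0)."""
    return -np.sum(d * np.sum(C * np.log(C_hat + eps), axis=0))

# Toy example: two labeled pixels of class 0, one of class 1, three unlabeled pixels.
labels = [0, 0, 1, -1, -1, -1]
d = class_weights(labels, n_classes=2)
C = np.array([[1, 1, 0, 1, 0, 0], [0, 0, 1, 0, 1, 1]], dtype=float)
C_hat = np.full((2, 6), 0.5)  # uninformative prediction
loss = weighted_cross_entropy(C, C_hat, d)
```

Each class and the unlabeled set contribute a total weight of one, so a small class is not dominated by a large one.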
To consider a more elaborate case, it is also possible to add a set of penalizations/constraints. In particular, a penalization of the classifier parameters $\mathbf{Q}$ is considered to prevent an artificial decrease of the loss function. This penalization is based on the Frobenius norm and is well known in the neural network community, where it is referred to as weight decay. The second considered penalization is a spatial regularization enforced through a smoothed weighted vectorial total variation norm (vTV). This regularization promotes a piecewise constant solution for the classification map $\mathbf{C}$. The overall resulting problem can be written

$$\min_{\mathbf{Q}, \mathbf{C}_{\mathcal{U}}} \; -\sum_{p \in \mathcal{P}} d_p \sum_{i \in \mathcal{C}} c_{i,p} \log\left(\frac{1}{1 + \exp(-\mathbf{q}_{i:}\mathbf{z}_p)}\right) + \lambda_q \|\mathbf{Q}\|_F^2 + \lambda_c \|\mathbf{C}\|_{\mathrm{vTV}} + \imath_{\mathbb{S}_C^{|\mathcal{U}|}}(\mathbf{C}_{\mathcal{U}}) \tag{5}$$

where $\mathbf{q}_{i:}$ is the $i$th row of $\mathbf{Q}$, $\lambda_q$ and $\lambda_c$ weight the regularization terms and

$$\|\mathbf{C}\|_{\mathrm{vTV}} = \sum_{p=1}^{P} \beta_p \sqrt{\left\|[\nabla_h \mathbf{C}]_p\right\|_2^2 + \left\|[\nabla_v \mathbf{C}]_p\right\|_2^2 + \epsilon} \tag{6}$$

where $\epsilon > 0$ is a smoothing parameter and $[\nabla_h(\cdot)]_p$ and $[\nabla_v(\cdot)]_p$ denote horizontal and vertical discrete gradients (see footnote 1)

$$[\nabla_h \mathbf{C}]_{(m,n)} = \mathbf{c}_{(m+1,n)} - \mathbf{c}_{(m,n)}, \qquad [\nabla_v \mathbf{C}]_{(m,n)} = \mathbf{c}_{(m,n+1)} - \mathbf{c}_{(m,n)}.$$

The weighting coefficients $\beta_{m,n}$ are introduced to account for the natural boundaries present in the image. They are computed beforehand using external data containing information on the spatial structures, e.g., a panchromatic image or a LIDAR image [11]. An example of such weights is described in Section IV.

Footnote 1: With a slight abuse of notation, $\mathbf{c}_{(m,n)}$ refers to the $p$th column of $\mathbf{C}$ where the $p$th pixel is spatially indexed by $(m, n)$.
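The smoothed weighted vTV norm of (6) can be sketched as below. The paper does not state how the discrete gradients are handled at the image border; this example assumes a zero gradient on the last row and last column, and it assumes the classification map is stored as an `(H, W, n_classes)` array with `m` indexing the first axis. The name `smoothed_vtv` is illustrative.

```python
import numpy as np

def smoothed_vtv(C_map, beta, eps=0.01):
    """Smoothed weighted vTV of (6).

    C_map: (H, W, n_classes) attribution vectors on the pixel grid.
    beta:  (H, W) weighting coefficients.
    """
    grad_h = np.zeros_like(C_map)
    grad_v = np.zeros_like(C_map)
    # c_(m+1,n) - c_(m,n); zero on the last row (assumed boundary handling).
    grad_h[:-1, :, :] = C_map[1:, :, :] - C_map[:-1, :, :]
    # c_(m,n+1) - c_(m,n); zero on the last column (assumed boundary handling).
    grad_v[:, :-1, :] = C_map[:, 1:, :] - C_map[:, :-1, :]
    sq_norm = (grad_h ** 2).sum(axis=2) + (grad_v ** 2).sum(axis=2)
    return float(np.sum(beta * np.sqrt(sq_norm + eps)))

# Toy maps on a 3 x 4 grid with 2 classes and uniform weights.
beta = np.full((3, 4), 1.0 / 12)
flat = np.zeros((3, 4, 2))
flat[..., 0] = 1.0                 # every pixel assigned to class 0
edge = flat.copy()
edge[:, 2:, :] = [0.0, 1.0]        # right half assigned to class 1
v_flat = smoothed_vtv(flat, beta)
v_edge = smoothed_vtv(edge, beta)
```

A constant map only pays the smoothing offset, while a map with a class boundary is penalized more, which is the piecewise constant behavior described in the text.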

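[Figure 1: block diagram with three stages. UNMIXING: image $\mathbf{Y}$, abundances $\mathbf{A}$, endmembers $\mathbf{M}$. CLUSTERING: $\min_{\mathbf{Z}} \|\mathbf{A} - \mathbf{B}\mathbf{Z}\|_F^2$. CLASSIFICATION: classification $\mathbf{C}_{\mathcal{L}}$, $\mathbf{C}_{\mathcal{U}}$, features $\mathbf{Z}$, classifier $\mathbf{Q}$.]
Fig. 1. Structure of the cofactorization model. Variables in blue stand for observations or available external data. Variables in olive green are linked through the clustering term. The variable in a dotted box is assumed to be known beforehand.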
C. Clustering
To define a global cofactorization problem, a relation is drawn between the activation matrices of the two factorization problems, namely the abundance matrix and the feature matrix. More specifically, following the idea developed in [8], a clustering term is introduced as a coupling. Abundance vectors are clustered and the resulting attribution vectors are then used as feature vectors for the classification. Ideally, clustering attribution vectors $\mathbf{z}_p \in \mathbb{R}^{K}$ are filled with zeros except for $z_{k,p} = 1$ when $\mathbf{a}_p$ is associated with the $k$th cluster. The well-known k-means is chosen to perform this task since it is easily expressed as an optimization problem

$$\min_{\mathbf{Z}, \mathbf{B}} \; \frac{1}{2} \|\mathbf{A} - \mathbf{B}\mathbf{Z}\|_F^2 + \imath_{\mathbb{S}_K^{P}}(\mathbf{Z}) + \imath_{\mathbb{R}_+^{R \times K}}(\mathbf{B}) \tag{7}$$

where the columns of $\mathbf{B} \in \mathbb{R}^{R \times K}$ stand for the centroids of the $K$ clusters. Two constraints are considered in this k-means clustering problem: i) a positivity constraint on $\mathbf{B}$ since centroids are expected to be interpretable as mean abundance vectors and ii) the vectors $\mathbf{z}_p$ ($p \in \mathcal{P}$) are assumed to be defined on the $K$-dimensional probability simplex $\mathbb{S}_K$. Thus, the resulting clustering method is a particular instance of k-means where the attribution vectors are relaxed and can be interpreted as the collection of probabilities of belonging to each of the clusters.
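To make the relaxed constraint on $\mathbf{Z}$ in (7) concrete, the sketch below projects each column of a matrix onto the probability simplex and evaluates the clustering term. The sort-based projection is one standard way to compute this projection; it is used here as an assumption, since the paper does not say which routine was employed. All function names are illustrative.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a 1-D vector onto the probability simplex (sort-based method)."""
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index where the condition holds
    tau = css[rho] / (rho + 1.0)
    return np.maximum(v - tau, 0.0)

def project_columns(Z):
    """Project every column z_p of Z onto the simplex."""
    return np.apply_along_axis(project_simplex, 0, Z)

def clustering_term(A, B, Z):
    """Coupling term 0.5 * ||A - B Z||_F^2 of (7)."""
    return 0.5 * np.linalg.norm(A - B @ Z, "fro") ** 2

# Toy example with K = 3 clusters and P = 2 pixels.
Z_raw = np.array([[2.0, 0.2], [0.0, 0.3], [-1.0, 0.5]])
Z = project_columns(Z_raw)
```

The first column collapses to a hard assignment, while the second column already lies on the simplex and is left unchanged, illustrating the soft attribution allowed by the relaxation.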
D. Multi-objective problem
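The two factorization problems corresponding to the spectral unmixing and classification tasks have been expressed and the link between these two problems has been set up through the clustering term. The global cofactorization problem, illustrated in Figure 1, is finally formulated as

$$\begin{aligned} \min_{\mathbf{A}, \mathbf{Q}, \mathbf{Z}, \mathbf{C}_{\mathcal{U}}, \mathbf{B}} \; & \frac{\lambda_0}{2} \|\mathbf{Y} - \mathbf{M}\mathbf{A}\|_F^2 + \lambda_a \|\mathbf{A}\|_1 + \imath_{\mathbb{R}_+^{R \times P}}(\mathbf{A}) \\ & - \frac{\lambda_1}{2} \sum_{p \in \mathcal{P}} d_p \sum_{i \in \mathcal{C}} c_{i,p} \log\left(\frac{1}{1 + \exp(-\mathbf{q}_{i:}\mathbf{z}_p)}\right) + \frac{\lambda_q}{2} \|\mathbf{Q}\|_F^2 + \lambda_c \|\mathbf{C}\|_{\mathrm{vTV}} + \imath_{\mathbb{S}_C^{|\mathcal{U}|}}(\mathbf{C}_{\mathcal{U}}) \\ & + \frac{\lambda_2}{2} \|\mathbf{A} - \mathbf{B}\mathbf{Z}\|_F^2 + \imath_{\mathbb{S}_K^{P}}(\mathbf{Z}) + \imath_{\mathbb{R}_+^{R \times K}}(\mathbf{B}) \end{aligned} \tag{8}$$

where $\lambda_0$, $\lambda_1$ and $\lambda_2$ are introduced to weight the contribution of the various terms.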
III. OPTIMIZATION SCHEME
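The proposed global optimization problem (8) is non-convex and non-smooth. Such problems are usually very challenging to solve. To handle it, we propose to resort to the PALM algorithm proposed in [9]. The PALM algorithm ensures convergence to a critical point of the objective function, i.e., a stationary point such as a local minimum. To apply PALM, the objective is rewritten as a sum of independent non-smooth terms $f_j(\cdot)$ ($j \in \{0, \ldots, 3\}$) and a smooth coupling term $g(\cdot)$

$$\min_{\mathbf{A}, \mathbf{B}, \mathbf{Z}, \mathbf{Q}, \mathbf{C}_{\mathcal{U}}} \; f_0(\mathbf{A}) + f_1(\mathbf{B}) + f_2(\mathbf{Z}) + f_3(\mathbf{C}_{\mathcal{U}}) + g(\mathbf{A}, \mathbf{B}, \mathbf{Z}, \mathbf{C}_{\mathcal{U}}, \mathbf{Q})$$

where

$$f_0(\mathbf{A}) = \imath_{\mathbb{R}_+^{R \times P}}(\mathbf{A}) + \lambda_a \|\mathbf{A}\|_1, \qquad f_1(\mathbf{B}) = \imath_{\mathbb{R}_+^{R \times K}}(\mathbf{B}),$$

$$f_2(\mathbf{Z}) = \imath_{\mathbb{S}_K^{P}}(\mathbf{Z}), \qquad f_3(\mathbf{C}_{\mathcal{U}}) = \imath_{\mathbb{S}_C^{|\mathcal{U}|}}(\mathbf{C}_{\mathcal{U}}),$$

$$g(\mathbf{A}, \mathbf{B}, \mathbf{Z}, \mathbf{C}_{\mathcal{U}}, \mathbf{Q}) = \frac{\lambda_0}{2} \|\mathbf{Y} - \mathbf{M}\mathbf{A}\|_F^2 - \frac{\lambda_1}{2} \sum_{p \in \mathcal{P}} d_p \sum_{i \in \mathcal{C}} c_{i,p} \log\left(\frac{1}{1 + \exp(-\mathbf{q}_{i:}\mathbf{z}_p)}\right) + \frac{\lambda_2}{2} \|\mathbf{A} - \mathbf{B}\mathbf{Z}\|_F^2 + \frac{\lambda_q}{2} \|\mathbf{Q}\|_F^2 + \lambda_c \|\mathbf{C}\|_{\mathrm{vTV}}.$$
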
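Algorithm 1: PALM
Initialize variables $\mathbf{A}^0$, $\mathbf{B}^0$, $\mathbf{Z}^0$, $\mathbf{C}_{\mathcal{U}}^0$ and $\mathbf{Q}^0$;
Set $\alpha > 1$;
while stopping criterion not reached do
    $\mathbf{A}^{k+1} \in \operatorname{prox}_{f_0}^{\alpha L_A}\left(\mathbf{A}^k - \frac{1}{\alpha L_A} \nabla_{\mathbf{A}} g(\mathbf{A}^k, \mathbf{B}^k, \mathbf{Z}^k, \mathbf{C}_{\mathcal{U}}^k, \mathbf{Q}^k)\right)$;
    $\mathbf{B}^{k+1} \in \operatorname{prox}_{f_1}^{\alpha L_B}\left(\mathbf{B}^k - \frac{1}{\alpha L_B} \nabla_{\mathbf{B}} g(\mathbf{A}^{k+1}, \mathbf{B}^k, \mathbf{Z}^k, \mathbf{C}_{\mathcal{U}}^k, \mathbf{Q}^k)\right)$;
    $\mathbf{Z}^{k+1} \in \operatorname{prox}_{f_2}^{\alpha L_Z}\left(\mathbf{Z}^k - \frac{1}{\alpha L_Z} \nabla_{\mathbf{Z}} g(\mathbf{A}^{k+1}, \mathbf{B}^{k+1}, \mathbf{Z}^k, \mathbf{C}_{\mathcal{U}}^k, \mathbf{Q}^k)\right)$;
    $\mathbf{Q}^{k+1} = \mathbf{Q}^k - \frac{1}{\alpha L_Q} \nabla_{\mathbf{Q}} g(\mathbf{A}^{k+1}, \mathbf{B}^{k+1}, \mathbf{Z}^{k+1}, \mathbf{C}_{\mathcal{U}}^k, \mathbf{Q}^k)$;
    $\mathbf{C}_{\mathcal{U}}^{k+1} \in \operatorname{prox}_{f_3}^{\alpha L_{C_{\mathcal{U}}}}\left(\mathbf{C}_{\mathcal{U}}^k - \frac{1}{\alpha L_{C_{\mathcal{U}}}} \nabla_{\mathbf{C}_{\mathcal{U}}} g(\mathbf{A}^{k+1}, \mathbf{B}^{k+1}, \mathbf{Z}^{k+1}, \mathbf{C}_{\mathcal{U}}^k, \mathbf{Q}^{k+1})\right)$;
end
return $\mathbf{A}^{\mathrm{end}}$, $\mathbf{B}^{\mathrm{end}}$, $\mathbf{Z}^{\mathrm{end}}$, $\mathbf{Q}^{\mathrm{end}}$, $\mathbf{C}_{\mathcal{U}}^{\mathrm{end}}$
Since no non-smooth term acts on $\mathbf{Q}$ in the decomposition of the objective, its proximal step reduces to a plain gradient step.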
The concept of this algorithm is to perform a proximal gradient descent with respect to each variable alternately. To apply PALM, the functions $f_j(\cdot)$ have to be proper, lower semi-continuous and extended real-valued. A sufficient condition on the function $g(\cdot)$ is to be $C^2$, i.e., with continuous first and second derivatives, and its partial gradients have to be globally Lipschitz. $L_X$ denotes herein the Lipschitz constant associated with the partial gradient with respect to $\mathbf{X}$. The detailed steps of the algorithm are summarized in Algorithm 1 and further theoretical details are available in [9].

In practice, one needs to be able to compute the partial gradients and their associated Lipschitz constants to perform the gradient descent. It is also necessary to compute the proximal operators associated with the non-smooth terms. In the present case, the partial gradients are easily computed and are all globally Lipschitz. The only problematic term is the vTV term, which is not gradient-Lipschitz in its canonical form. To alleviate this issue, a smoothed counterpart has been introduced in (6) with a smoothing parameter $\epsilon \in \mathbb{R}_+$. As for the proximal operators, they are well known [12] except for $f_0(\cdot)$. For $f_0(\cdot)$, it is necessary to resort to the composition of the proximal operators associated with the nonnegativity constraint and the $\ell_1$-norm, which is possible here according to [13].
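As a hedged illustration of one PALM update, the sketch below implements the step on $\mathbf{A}$ only. The partial gradient $\lambda_0 \mathbf{M}^T(\mathbf{M}\mathbf{A} - \mathbf{Y}) + \lambda_2(\mathbf{A} - \mathbf{B}\mathbf{Z})$ and the Lipschitz constant $\lambda_0 \|\mathbf{M}\|_2^2 + \lambda_2$ are derived here from the two terms of $g$ that depend on $\mathbf{A}$; the paper does not list them explicitly. The proximal operator of $f_0$ is written as a shifted thresholding at zero, which is what composing soft-thresholding with the projection onto the nonnegative orthant gives. Function names, toy sizes and the value of `alpha` are assumptions of this example.

```python
import numpy as np

def prox_f0(X, lam_a, c):
    """prox of f0 = nonnegativity indicator + lam_a * l1-norm, with quadratic weight c."""
    return np.maximum(X - lam_a / c, 0.0)

def palm_update_A(A, B, Z, Y, M, lam0, lam2, lam_a, alpha=1.1):
    """One proximal gradient step on A, the other variables being fixed."""
    grad = lam0 * M.T @ (M @ A - Y) + lam2 * (A - B @ Z)
    L_A = lam0 * np.linalg.norm(M, 2) ** 2 + lam2   # Lipschitz constant of the partial gradient
    c = alpha * L_A
    return prox_f0(A - grad / c, lam_a, c)

def partial_objective(A, B, Z, Y, M, lam0, lam2, lam_a):
    """Terms of problem (8) that depend on A (A assumed nonnegative)."""
    return (0.5 * lam0 * np.linalg.norm(Y - M @ A, "fro") ** 2
            + 0.5 * lam2 * np.linalg.norm(A - B @ Z, "fro") ** 2
            + lam_a * np.abs(A).sum())

# Toy problem with made-up sizes: 6 bands, 3 endmembers, 2 clusters, 5 pixels.
rng = np.random.default_rng(0)
M = rng.random((6, 3))
Y = M @ rng.random((3, 5))
B = rng.random((3, 2))
Z = rng.random((2, 5))
Z /= Z.sum(axis=0)                      # columns on the simplex
A0 = rng.random((3, 5))
params = dict(lam0=1.0, lam2=1.0, lam_a=1e-3)
A1 = palm_update_A(A0, B, Z, Y, M, **params)
before = partial_objective(A0, B, Z, Y, M, **params)
after = partial_objective(A1, B, Z, Y, M, **params)
```

Because the step size is $1/(\alpha L_A)$ with $\alpha > 1$, the part of the objective that depends on $\mathbf{A}$ cannot increase after the update, which can be checked by comparing `before` and `after`.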
IV. EXPERIMENTS
Data generation – The HSI used to perform the experiments is a semi-synthetic image. More specifically, the image has been generated using a real HSI. The real image has been unmixed using a fully constrained least squares (FCLS) algorithm [14] with $R = 5$ endmembers extracted with the well-known VCA algorithm [15]. The obtained abundance maps have then been used to generate a new synthetic image using pure spectra from the ASTER hyperspectral library [16]. The groundtruth of the original data, composed of $C = 3$ classes, has been preserved to assess the quality of the classification. A color composition, a panchromatic version and the groundtruth are presented in Figure 2. The subset of the image used as training data is also shown in Figure 2.

[Figure 2: four image panels (a) to (d).]
Fig. 2. Synthetic image: (a) colored composition of the HSI $\mathbf{Y}$, (b) panchromatic image $\mathbf{y}_{\mathrm{PAN}}$, (c) classification ground-truth, (d) training set.
Initialization and convergence – As stated before, cofactorization is a non-convex problem and PALM only ensures convergence to a critical point of the objective function. It is thus important to carefully initialize the estimated variables in order to reach a relevant solution. In the presented experiment, the abundance matrix $\mathbf{A}^0$ has been initialized by solving $\min_{\mathbf{A} \in \mathbb{R}_+^{R \times P}} \|\mathbf{Y} - \mathbf{M}\mathbf{A}\|_F^2$ using a projected gradient algorithm. Then, a k-means algorithm has been applied to the obtained abundance vectors and the resulting centroids and attribution vectors have been used to initialize $\mathbf{B}^0$ and $\mathbf{Z}^0$. On the other hand, the classifier parameters $\mathbf{Q}^0$ and the classification matrix $\mathbf{C}_{\mathcal{U}}^0$ have been initialized randomly.

In order to assess the convergence of the optimization scheme, the normalized difference between two consecutive values of the objective function is monitored. When this value reaches a certain threshold ($10^{-4}$ for this experiment), the optimization process stops and the last estimate is assumed to be close enough to the solution.
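The stopping rule described above can be sketched as a small helper. The paper does not specify the normalization; dividing by the magnitude of the previous objective value, and guarding against a zero denominator, are assumptions of this example, as is the name `has_converged`.

```python
def has_converged(prev_value, curr_value, tol=1e-4):
    """True when the normalized change between two consecutive objective values is below tol."""
    denom = max(abs(prev_value), 1e-12)   # guard against division by zero (assumption)
    return abs(prev_value - curr_value) / denom < tol

# Made-up sequence of objective values, standing in for successive PALM iterations.
values = [10.0, 5.0, 4.0, 3.9999]
stop_iteration = None
for k in range(1, len(values)):
    if has_converged(values[k - 1], values[k]):
        stop_iteration = k
        break
```

In an actual run, `values` would be replaced by the objective of problem (8) evaluated after each pass over the five variables.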
Hyperparameters – Multiple hyperparameters $\lambda_{\cdot}$ have been introduced in problem (8) to weight the various terms of the objective function. For practical use, these parameters have been normalized by the size and dynamics of the corresponding variables. These normalized parameters, denoted $\tilde{\lambda}_{\cdot}$, have been empirically tuned to obtain consistent results ($\tilde{\lambda}_0 = \tilde{\lambda}_1 = \tilde{\lambda}_2 = 1$, $\tilde{\lambda}_a = 10^{-3}$, $\tilde{\lambda}_q = 0.15$). For the last hyperparameter $\tilde{\lambda}_c$, two values have been considered, 0 and 0.1, standing respectively for the case without and with spatial regularization. The definition of the vTV regularization also includes parameters which have to be properly set. First, the smoothing parameter is set to $\epsilon = 0.01$ to ensure the gradient-Lipschitz property without substantially modifying the TV-norm. Secondly, it is necessary to define the weighting coefficients $\beta_{m,n}$. They have been computed from a panchromatic image $\mathbf{y}_{\mathrm{PAN}}$, shown in Figure 2, generated by normalizing hyperspectral bands by their mean and then summing them. More precisely, to account for possible homogeneous areas in the image, they are defined as follows

$$\beta_p = \frac{\tilde{\beta}_p}{\sum_q \tilde{\beta}_q} \quad \text{with} \quad \tilde{\beta}_q = \left(\left\|[\nabla \mathbf{y}_{\mathrm{PAN}}]_q\right\|_2 + \sigma\right)^{-1}$$

where $\sigma = 0.01$ controls the variation of the weights and avoids numerical issues.

TABLE I. UNMIXING AND CLASSIFICATION RESULTS.
Model           Kappa   F1-mean   RMSE(Â)   RE      Time (s)
RF              0.817   0.842     N/A       N/A     0.4
FCLS            N/A     N/A       0.0701    0.224   1.2
CBPDN           N/A     N/A       0.0792    0.229   2
D-KSVD          0.494   0.554     N/A       0.923   70
Cofact.         0.847   0.870     0.0504    0.750   180
Cofact. + vTV   0.874   0.895     0.0526    0.752   81
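The construction of the panchromatic image and of the weights $\beta_p$ described in the hyperparameter paragraph can be sketched as follows. The forward-difference gradient with a zero value on the last row and column is an assumption of this example, since the boundary handling is not specified in the text; the function names are illustrative.

```python
import numpy as np

def panchromatic(Y_cube):
    """Y_cube: (H, W, L) image. Normalize each band by its mean, then sum the bands."""
    return (Y_cube / Y_cube.mean(axis=(0, 1), keepdims=True)).sum(axis=2)

def vtv_weights(y_pan, sigma=0.01):
    """Weights beta_p: inverse of (gradient magnitude + sigma), normalized to sum to one."""
    grad_h = np.zeros_like(y_pan)
    grad_v = np.zeros_like(y_pan)
    grad_h[:-1, :] = y_pan[1:, :] - y_pan[:-1, :]   # zero on the last row (assumption)
    grad_v[:, :-1] = y_pan[:, 1:] - y_pan[:, :-1]   # zero on the last column (assumption)
    grad_norm = np.sqrt(grad_h ** 2 + grad_v ** 2)
    beta_tilde = 1.0 / (grad_norm + sigma)
    return beta_tilde / beta_tilde.sum()

# Toy panchromatic image with a vertical boundary between columns 1 and 2.
y_pan = np.zeros((4, 4))
y_pan[:, 2:] = 1.0
beta = vtv_weights(y_pan)
```

Pixels located on the boundary receive a much smaller weight than pixels in homogeneous areas, so the spatial regularization is relaxed where the external image shows an edge.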
Compared methods – To assess the quality of the unmixing and classification results, the proposed method has been compared to several well-known unmixing and classification algorithms. Regarding classification, we considered the random forest (RF) algorithm, known to perform very well when classifying HSI. Parameters of the RF (number of trees, depth) have been adjusted using grid search and cross-validation. The discriminative K-SVD (D-KSVD) method has been used as a benchmark [17]. This model is also a cofactorization method but with a simpler approach where the two coding matrices $\mathbf{A}$ and $\mathbf{Z}$ are constrained to be equal. In this case, the first term is not a spectral unmixing task but rather a dictionary learning task where dictionary elements are assumed to be discriminative for the classification task. Only a sparsity penalization is considered for D-KSVD, using an $\ell_0$-norm.

As for the unmixing comparison, we considered two methods described in [14]. The first method is the fully constrained least squares method (FCLS), where the corresponding optimization problem is defined as the data fitting term with positivity and sum-to-one constraints on the abundance vectors $\mathbf{a}_p$. The second method is the constrained basis pursuit denoising (CBPDN), corresponding to problem (1). The hyperparameter $\lambda_a$, weighting the sparsity penalty, is also adjusted using grid search and cross-validation. It should be noted that all unmixing methods directly use the correct endmember matrix $\mathbf{M}$ which has been used to generate the data. Additionally, the endmember matrix is used to initialize the dictionary of the D-KSVD method.

References

Journal Article

28,684 citations


"Matrix Cofactorization for Joint Un..." refers background in this paper

  • ...This particular loss function has been extensively used in the context of neural networks [10]....

    [...]


BookDOI
Russell G. Congalton, Kass Green
17 Sep 1998
TL;DR: This chapter discusses Accuracy Assessment, which examines the impact of sample design on cost, statistical Validity, and measuring Variability in the context of data collection and analysis.
Abstract: Introduction Why Accuracy Assessment? Overview Historical Review Aerial Photography Digital Assessments Data Collection Considerations Classification Scheme Statistical Considerations Data Distribution Randomness Spatial Autocorrelation Sample Size Sampling Scheme Sample Unit Reference Data Collection Basic Collection Forms Basic Analysis Techniques Non-Site Specific Assessments Site Specific Assessments Area Estimation/Correction Practicals Impact of Sample Design on Cost Recommendations for Collecting Reference Data ASources of Variation in Reference Data Photo Interpretation vs. Ground Visitation Interpreter Variability Observations vs. Measurements What is Correct? Labeling Map vs. Labeling the Reference Data Qualitative vs. Quantitative Analysis Local vs. Regional vs. Global Assessments Advanced Topics Beyond the Error Matrix Modifying the Error Matrix Fuzzy Set Theory Measuring Variability Complex Data Sets Change Detection Multi-Layer Assessments California Hardwood Rangeland Monitoring Project Case Study Balancing Statistical Validity with Practical Reality Bibliography

4,394 citations


"Matrix Cofactorization for Joint Un..." refers methods in this paper

  • ...To evaluate the classification accuracy, two conventional metrics are used, namely Cohen’s kappa coefficient and the averaged F1-score over all classes [18]....

    [...]


Journal ArticleDOI
José M. P. Nascimento, J.M.B. Dias
TL;DR: A new method for unsupervised endmember extraction from hyperspectral data, termed vertex component analysis (VCA), which competes with state-of-the-art methods, with a computational complexity between one and two orders of magnitude lower than the best available method.
Abstract: Given a set of mixed spectral (multispectral or hyperspectral) vectors, linear spectral mixture analysis, or linear unmixing, aims at estimating the number of reference substances, also called endmembers, their spectral signatures, and their abundance fractions. This paper presents a new method for unsupervised endmember extraction from hyperspectral data, termed vertex component analysis (VCA). The algorithm exploits two facts: (1) the endmembers are the vertices of a simplex and (2) the affine transformation of a simplex is also a simplex. In a series of experiments using simulated and real data, the VCA algorithm competes with state-of-the-art methods, with a computational complexity between one and two orders of magnitude lower than the best available method.

2,090 citations


"Matrix Cofactorization for Joint Un..." refers methods in this paper

  • ...The real image has been unmixed using a fully constrained least square (FCLS) algorithm [14] using R = 5 endmembers extracted with the well-known VCA algorithm [15]....

    [...]


Journal ArticleDOI
TL;DR: This paper presents an overview of unmixing methods from the time of Keshava and Mustard's unmixing tutorial to the present, including signal-subspace, geometrical, statistical, sparsity-based, and spatial-contextual unmixing algorithms.
Abstract: Imaging spectrometers measure electromagnetic energy scattered in their instantaneous field view in hundreds or thousands of spectral channels with higher spectral resolution than multispectral cameras. Imaging spectrometers are therefore often referred to as hyperspectral cameras (HSCs). Higher spectral resolution enables material identification via spectroscopic analysis, which facilitates countless applications that require identifying materials in scenarios unsuitable for classical spectroscopic analysis. Due to low spatial resolution of HSCs, microscopic material mixing, and multiple scattering, spectra measured by HSCs are mixtures of spectra of materials in a scene. Thus, accurate estimation requires unmixing. Pixels are assumed to be mixtures of a few materials, called endmembers. Unmixing involves estimating all or some of: the number of endmembers, their spectral signatures, and their abundances at each pixel. Unmixing is a challenging, ill-posed inverse problem because of model inaccuracies, observation noise, environmental conditions, endmember variability, and data set size. Researchers have devised and investigated many models searching for robust, stable, tractable, and accurate unmixing algorithms. This paper presents an overview of unmixing methods from the time of Keshava and Mustard's unmixing tutorial to the present. Mixing models are first discussed. Signal-subspace, geometrical, statistical, sparsity-based, and spatial-contextual unmixing algorithms are described. Mathematical problems and potential solutions are described. Algorithm characteristics are illustrated experimentally.

1,979 citations


"Matrix Cofactorization for Joint Un..." refers to methods in this paper

  • ...More specifically, the image has been generated using a real HSI....

    [...]

  • ...Regarding classification, we considered the random forest (RF) algorithm, known to perform very well to classify HSI....

    [...]

  • ...Each pixel of an HSI is a L-dimensional measurement of a reflectance spectrum....

    [...]

  • ...In the specific case of hyperspectral images (HSI), images capture a very rich signal since each pixel is a sampling of the reflectance spectrum of the corresponding area, typically in the visible and infrared spectral domains with hundreds of measurements....

    [...]

  • ...To fully exploit the available information, it is interesting to resort to alternative methods of interpretation such as representation learning methods, namely spectral unmixing in the case of HSI [3]....

    [...]


Journal ArticleDOI
Mariana Belgiu, Lucian Drăguţ
TL;DR: This review has revealed that RF classifier can successfully handle high data dimensionality and multicolinearity, being both fast and insensitive to overfitting.
Abstract: A random forest (RF) classifier is an ensemble classifier that produces multiple decision trees, using a randomly selected subset of training samples and variables. This classifier has become popular within the remote sensing community due to the accuracy of its classifications. The overall objective of this work was to review the utilization of RF classifier in remote sensing. This review has revealed that RF classifier can successfully handle high data dimensionality and multicolinearity, being both fast and insensitive to overfitting. It is, however, sensitive to the sampling design. The variable importance (VI) measurement provided by the RF classifier has been extensively exploited in different scenarios, for example to reduce the number of dimensions of hyperspectral data, to identify the most relevant multisource remote sensing and geographic data, and to select the most suitable season to classify particular target classes. Further investigations are required into less commonly exploited uses of this classifier, such as for sample proximity analysis to detect and remove outliers in the training samples.

1,862 citations


Performance Metrics
No. of citations received by the Paper in previous years
Year    Citations
2003    1