
Matrix Cofactorization for Joint Unmixing and Classification of Hyperspectral Images

TL;DR: This paper introduces a matrix cofactorization approach to perform spectral unmixing and classification jointly, solving the resulting non-smooth, non-convex problem with a proximal alternating linearized minimization (PALM) algorithm that ensures convergence to a critical point.
Abstract: This paper introduces a matrix cofactorization approach to perform spectral unmixing and classification jointly. After formulating the unmixing and classification tasks as matrix factorization problems, a link is introduced between the two coding matrices, namely the abundance matrix and the feature matrix. This coupling term can be interpreted as a clustering term where the abundance vectors are clustered and the resulting attribution vectors are then used as feature vectors. The overall non-smooth, non-convex optimization problem is solved using a proximal alternating linearized minimization algorithm (PALM) ensuring convergence to a critical point. The quality of the obtained results is finally assessed by comparison to other conventional algorithms on a semi-synthetic yet realistic dataset.

Summary (2 min read)

Introduction

  • In particular, classification algorithms have received a lot of attention from the scientific community.
  • In the specific case of hyperspectral images (HSI), images capture a very rich signal since each pixel is a sampling of the reflectance spectrum of the corresponding area, typically in the visible and infrared spectral domains with hundreds of measurements.
  • The core concept is to express the two problems of interest, namely spectral unmixing and classification, as factorization problems and then to introduce a coupling term to intertwine the two estimations.
  • Finally, the method is tested and compared to other unmixing and classification methods in Section IV.

II. PROBLEM STATEMENT

  • As presented in Sections II-A and II-B, spectral unmixing and supervised classification are commonly expressed as factorization problems.
  • In the proposed model, the link is made between the abundance matrix and the feature matrix.
  • More precisely, the coupling term is expressed as a clustering term over the abundance vectors where the attribution vectors to the clusters are also the feature vectors of the classification as detailed in Section II-C.

B. Classification

  • Numerous decision rules have been proposed to carry out classification.
  • The weighting coefficients dp adjust the cost function with respect to the sizes of the training and test sets, in particular in the case of unbalanced classes.
  • Moreover, the nonlinear mapping φ(·) is chosen as a sigmoid, which makes the proposed classifier interpretable as a one layer neural network.
  • The second considered penalization is a spatial regularization enforced through a smoothed weighted vectorial total variation norm (vTV).
  • They are computed beforehand using external data containing information on the spatial structures, e.g., a panchromatic image or a LIDAR image [11].

C. Clustering

  • To define a global cofactorization problem, a relation is drawn between the activation matrices of the two factorization problems, namely the abundance matrix and the feature matrix.
  • Abundance vectors are clustered and the resulting attribution vectors are then used as feature vectors for the classification.
  • Thus, the resulting clustering method is a particular instance of k-means where the attribution vectors are relaxed and can be interpreted as the collection of probabilities of belonging to each of the clusters.

III. OPTIMIZATION SCHEME

  • The proposed global optimization problem (8) is non-convex and non-smooth.
  • Such problems are usually very challenging to solve.
  • The concept of this algorithm is to perform a proximal gradient descent with respect to each variable alternately.
  • In the present case, the partial gradients are easily computed and are all globally Lipschitz.
  • As for the proximal operators, they are well known [12] except for f0(·).

IV. EXPERIMENTS

  • Data generation – The HSI used to perform the experiments is a semi-synthetic image.
  • For the last hyperparameter λ̃c, two values have been considered, 0 and 0.1, standing respectively for the cases without and with spatial regularization.
  • It should be noted that all unmixing methods use directly the correct endmember matrix M which has been used to generate the data.
  • Processing time is indeed higher for the proposed cofactorization method than for RF, FCLS and CBPDN.
  • In terms of qualitative results, Figure 3 presents the classification maps which appear consistent with the quantitative results.

V. CONCLUSION AND PERSPECTIVE

  • This paper introduces a unified framework to jointly perform spectral unmixing and classification by means of a cofactorization problem.
  • The overall cofactorization task is formulated as a non-convex, non-smooth optimization problem whose solution is approximated with a PALM algorithm offering convergence guarantees.



Official URL / DOI: https://doi.org/10.23919/EUSIPCO.2019.8903037

This is an author's version published in: http://oatao.univ-toulouse.fr/24982
OATAO (Open Archive Toulouse Archive Ouverte) is an open access repository that collects the work of Toulouse researchers and makes it freely available over the web where possible. Any correspondence concerning this service should be sent to the repository administrator: tech-oatao@listes-diff.inp-toulouse.fr

To cite this version: Lagrange, Adrien and Fauvel, Mathieu and May, Stéphane and Bioucas-Dias, José M. and Dobigeon, Nicolas. Matrix Cofactorization for Joint Unmixing and Classification of Hyperspectral Images. (2019) In: 27th European Signal Processing Conference (EUSIPCO 2019), 2 September 2019 - 6 September 2019 (A Coruña, Spain).

Authors: Adrien Lagrange and Nicolas Dobigeon (IRIT/INP-ENSEEIHT, University of Toulouse, Toulouse, France), Mathieu Fauvel (Centre d'Études Spatiales de la BIOsphère (CESBIO), INRA, Toulouse, France), Stéphane May (Centre National d'Études Spatiales (CNES), DCT/SI/AP, Toulouse, France), José M. Bioucas-Dias (Instituto de Telecomunicações, Instituto Superior Técnico, Universidade de Lisboa, 1049-001 Lisbon, Portugal). Contact: firstname.name@{enseeiht,inra,cnes,enseeiht}.fr, bioucas@lx.it.pt

Index Terms—supervised learning, spectral unmixing, cofactorization, hyperspectral images.
I. INTRODUCTION
Following the fast increase of available remote sensing images, many methods have been proposed to extract information from such specific data. In particular, classification algorithms have received a lot of attention from the scientific community. The emergence of state-of-the-art algorithms such as convolutional neural networks [1] or random forests [2] has brought unprecedented results. In the so-called supervised classification framework, these algorithms make it possible to infer a classification rule from a reduced number of examples provided by an expert. This rule is then used to attribute to unknown pixels a class among a predefined set of classes. Although very efficient, classification remains a limited analysis of the image since it only attributes a single class to each pixel when it is sometimes possible to extract more information. In the specific case of hyperspectral images (HSI), images capture a very rich signal since each pixel is a sampling of the reflectance spectrum of the corresponding area, typically in the visible and infrared spectral domains with hundreds of measurements. To fully exploit the available information, it is interesting to resort to alternative methods of interpretation such as representation learning methods, namely spectral unmixing in the case of HSI [3].
Part of this work has been supported by Centre National d'Études Spatiales (CNES), Occitanie Region, EU FP7 through the ERANETMED JC-WATER program, MapInvPlnt Project ANR-15-NMED-0002-02 and ANR-3IA Artificial and Natural Intelligence Toulouse Institute (ANITI).
Spectral unmixing is a physics-based model which assumes that a given pixel, i.e., a given measured spectrum, is the result of the combination of a reduced number of elementary spectra, called endmembers, each specific to a given material. The aim of unmixing methods is to infer the proportion of each material present in the pixel. The obtained abundance maps display the spatial distribution of the materials in the observed scene.
Even if classification and spectral unmixing are two widely-
used techniques, very few attempts have been made to com-
bine them. Most of these works [4], [5] intend to improve
classification results by using spectral unmixing to identify
mixed pixels and then process specifically the identified mixed
pixels. Instead of using the two methods sequentially, the
method proposed in this paper introduces the idea of a joint
unmixing and classification. This method is formulated as a
cofactorization problem, which is known to produce valuable
results in many application fields such as music source separa-
tion [6], and image analysis [7]. The core concept is to express
the two problems of interest, namely spectral unmixing and
classification, as factorization problems and then to introduce
a coupling term to intertwine the two estimations. Similarly
to [8], the coupling term is defined as a clustering term where
the abundance vectors provided by the unmixing step are
clustered and the resulting attribution vectors are then used as
feature vectors for the classification. The overall optimization
problem is non-convex and non-smooth. Such problems are known
to be challenging to solve but, building on recent advances in
optimization, the PALM algorithm proposed in [9] is used as
an optimization scheme, thus guaranteeing convergence to a
critical point of the objective function.
The rest of this paper is organized as follows. Section II
defines the two factorization tasks and introduces the global
cofactorization problem. Then, the method used to minimize
the resulting criterion is presented in Section III. Finally,
the method is tested and compared to other unmixing and
classification methods in Section IV. Section V draws some
conclusions and perspectives.
II. PROBLEM STATEMENT
As presented in Sections II-A and II-B, spectral unmixing
and supervised classification are commonly expressed as fac-
torization problems. We propose to derive a unified framework

by considering a global cofactorization problem. It relies on
a link between the two factorization problems in order to
perform a joint estimation. In the proposed model, the link
is made between the abundance matrix and the feature matrix.
More precisely, the coupling term is expressed as a clustering
term over the abundance vectors where the attribution vectors
to the clusters are also the feature vectors of the classification
as detailed in Section II-C.
A. Spectral unmixing
Each pixel of an HSI is an $L$-dimensional measurement of a reflectance spectrum. Physics models this spectrum as a combination of $R$ elementary spectra, gathered in the so-called endmember matrix $\mathbf{M} \in \mathbb{R}^{L\times R}$, each characterizing a specific material. The spectral unmixing task aims at retrieving the so-called abundance vectors $\mathbf{a}_p \in \mathbb{R}^R$, with $R \ll L$, from the spectrum $\mathbf{y}_p \in \mathbb{R}^L$ of the $p$th pixel ($p \in \mathcal{P}$ where $\mathcal{P} \triangleq \{1, \ldots, P\}$ is the set of pixel indexes). These abundance vectors describe the mixture contained in the pixel. Using the conventional linear mixture model, the spectral unmixing problem can be expressed as follows

$$\min_{\mathbf{M},\mathbf{A}} \frac{1}{2}\|\mathbf{Y}-\mathbf{M}\mathbf{A}\|_F^2 + \lambda_a\|\mathbf{A}\|_1 + \imath_{\mathbb{R}_+^{R\times P}}(\mathbf{A}) \tag{1}$$

where the matrix $\mathbf{Y} \in \mathbb{R}^{L\times P}$ gathers the $P$ pixel spectra and $\mathbf{A} \in \mathbb{R}^{R\times P}$ the abundance vectors. In addition to the data-fitting term, two penalization terms are considered in the proposed unmixing model. The term $\imath_{\mathbb{R}_+^{R\times P}}(\mathbf{A})$ enforces a nonnegativity constraint, ensuring an additive decomposition of the spectra. The second penalization $\lambda_a\|\mathbf{A}\|_1$ is a sparsity penalization promoting the concept that only a few endmembers are active in a given pixel. In the following work, the choice has been made to discard the estimation of the endmember matrix for the sake of simplicity. The endmember matrix is assumed to be known or estimated beforehand.
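As an illustration (not the solver used in the paper, where the joint problem is handled by PALM in Section III), problem (1) with $\mathbf{M}$ fixed can be approached by a proximal gradient iteration: a gradient step on the quadratic data-fitting term followed by the proximal operator of $\lambda_a\|\cdot\|_1 + \imath_{\mathbb{R}_+}(\cdot)$, which reduces to a one-sided soft-thresholding. The following Python sketch assumes this setting; the function name and iteration count are illustrative.

```python
import numpy as np

def unmix_proximal_gradient(Y, M, lam_a=1e-3, n_iter=500):
    """Sketch: solve (1) for A with M fixed, by proximal gradient descent.
    The prox of lam_a*||.||_1 + nonnegativity is a one-sided soft-threshold."""
    R, P = M.shape[1], Y.shape[1]
    A = np.zeros((R, P))
    step = 1.0 / np.linalg.norm(M.T @ M, 2)      # 1 / Lipschitz constant
    for _ in range(n_iter):
        grad = M.T @ (M @ A - Y)                 # gradient of 0.5*||Y - MA||_F^2
        A = np.maximum(A - step * grad - step * lam_a, 0.0)
    return A
```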
B. Classification
In the context of supervised classification, a subset of pixels is available with their corresponding groundtruth. The index subset of labeled pixels is denoted hereafter $\mathcal{L}$ while the index subset of unlabeled pixels is $\mathcal{U}$ (with $\mathcal{L} \cap \mathcal{U} = \emptyset$ and $\mathcal{L} \cup \mathcal{U} = \mathcal{P}$). Classification intends to assign one of the $C$ classes to each pixel. In practice, classifying can be formulated as estimating a $C \times P$ matrix $\mathbf{C}$ whose columns correspond to unknown $C$-dimensional attribution vectors $\mathbf{c}_p$ ($p \in \mathcal{U}$). Each vector is made of 0s except for $c_{i,p} = 1$ when the $p$th pixel is assigned to the $i$th class. Numerous decision rules have been proposed to carry out classification. Most of them rely on the use of feature vectors $\mathbf{z}_p \in \mathbb{R}^K$ ($p \in \mathcal{P}$) associated with the $P$ pixels, gathered in the matrix $\mathbf{Z} \in \mathbb{R}^{K\times P}$. Considering a linear classifier parametrized by the matrix $\mathbf{Q} \in \mathbb{R}^{C\times K}$, a vector-wise nonlinear mapping $\phi(\cdot)$, such as a sigmoid or a softmax operator, is then applied to the output of the classifier. Finally, the classification rule can be expressed as the matrix factorization problem

$$\min_{\mathbf{Q},\mathbf{C}_\mathcal{U}} J_c(\mathbf{C}, \phi(\mathbf{Q}\mathbf{Z})) + \imath_{\mathcal{S}_C^{|\mathcal{U}|}}(\mathbf{C}_\mathcal{U}) \tag{2}$$

where $J_c(\cdot,\cdot)$ is a cost function measuring the quality of the estimated attribution vectors $\phi(\mathbf{Q}\mathbf{z}_p)$ and $\mathcal{S}_C$ is the $C$-dimensional probability simplex ensuring nonnegativity and sum-to-one constraints on the attribution vectors. In this work, the cost function $J_c(\cdot,\cdot)$ has been chosen as the cross-entropy, defined in a multi-class problem as
defined in a multi-class problem as
J
c
(C,
ˆ
C) =
!
p∈P
d
p
!
i∈C
c
i,p
log c
i,p
) (3)
with
d
p
=
"
1
|L
i
|
, if p L
i
,
1
|U|
, if p U,
(4)
where $\mathcal{L}_i$ is the subset of labeled pixels belonging to class $i$, $\hat{\mathbf{c}}_p$ is the estimated attribution vector and $\mathbf{c}_p$ the true one. The weighting coefficients $d_p$ adjust the cost function with respect to the sizes of the training and test sets, in particular in the case of unbalanced classes. This particular loss function has been extensively used in the context of neural networks [10]. Moreover, the nonlinear mapping $\phi(\cdot)$ is chosen as a sigmoid, which makes the proposed classifier interpretable as a one-layer neural network.
To consider a more elaborate case, it is also possible to add a
set of penalizations/constraints. In particular, a penalization of
the classifier parameters Q is considered to prevent an artificial
decrease of the loss function. This penalization is based on
a Frobenius-norm and is well-known in the neural network
community where it is referred to as weight decay. The second
considered penalization is a spatial regularization enforced
through a smoothed weighted vectorial total variation norm
(vTV). This regularization promotes a piece-wise constant
solution for the classification map C. The overall resulting
problem can be written

$$\min_{\mathbf{Q},\mathbf{C}_\mathcal{U}} \; -\sum_{p\in\mathcal{P}} d_p \sum_{i\in\mathcal{C}} c_{i,p} \log\left(\frac{1}{1+\exp(-\mathbf{q}_{i:}\mathbf{z}_p)}\right) + \lambda_q \|\mathbf{Q}\|_F^2 + \lambda_c \|\mathbf{C}\|_{\mathrm{vTV}} + \imath_{\mathcal{S}_C^{|\mathcal{U}|}}(\mathbf{C}_\mathcal{U}) \tag{5}$$

where $\mathbf{q}_{i:}$ is the $i$th row of $\mathbf{Q}$, $\lambda_q$ and $\lambda_c$ weight the regularization terms, and

$$\|\mathbf{C}\|_{\mathrm{vTV}} = \sum_{p=1}^{P} \beta_p \sqrt{\left\|[\nabla_h \mathbf{C}]_p\right\|_2^2 + \left\|[\nabla_v \mathbf{C}]_p\right\|_2^2 + \epsilon} \tag{6}$$
where $\epsilon > 0$ is a smoothing parameter and $[\nabla_h(\cdot)]_p$ and $[\nabla_v(\cdot)]_p$ denote the horizontal and vertical discrete gradients¹

$$[\nabla_h \mathbf{C}]_{(m,n)} = \mathbf{c}_{(m+1,n)} - \mathbf{c}_{(m,n)}, \qquad [\nabla_v \mathbf{C}]_{(m,n)} = \mathbf{c}_{(m,n+1)} - \mathbf{c}_{(m,n)}.$$

The weighting coefficients $\beta_{m,n}$ are introduced to account for the natural boundaries present in the image. They are computed beforehand using external data containing information on the spatial structures, e.g., a panchromatic image or a LIDAR image [11]. An example of such weights is described in Section IV.

¹With a slight abuse of notation, $\mathbf{c}_{(m,n)}$ refers to the $p$th column of $\mathbf{C}$ where the $p$th pixel is spatially indexed by $(m, n)$.
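To make the classification terms concrete, the following minimal numpy sketch evaluates the weighted cross-entropy of (3)-(5) with a sigmoid mapping and the smoothed weighted vTV of (6) on a spatially reshaped classification map. Function names and array shapes are assumptions for illustration, not the authors' code.

```python
import numpy as np

def sigmoid(x):
    """Vector-wise nonlinear mapping phi(.) chosen in the paper."""
    return 1.0 / (1.0 + np.exp(-x))

def classification_loss(Q, Z, C, d, lam_q):
    """Weighted cross-entropy of (3)-(5) plus the weight-decay penalty.
    Q: (n_classes, K) classifier, Z: (K, P) features, C: (n_classes, P)
    attribution vectors, d: (P,) weights d_p from (4)."""
    S = sigmoid(Q @ Z)                           # class scores, (n_classes, P)
    eps = 1e-12                                  # numerical safeguard for log
    ce = -np.sum(d * np.sum(C * np.log(S + eps), axis=0))
    return ce + lam_q * np.sum(Q ** 2)

def smoothed_vtv(C_img, beta, eps=0.01):
    """Smoothed weighted vectorial TV of (6); C_img: (n_classes, H, W)."""
    dh = np.zeros_like(C_img)
    dv = np.zeros_like(C_img)
    dh[:, :-1, :] = C_img[:, 1:, :] - C_img[:, :-1, :]    # horizontal gradient
    dv[:, :, :-1] = C_img[:, :, 1:] - C_img[:, :, :-1]    # vertical gradient
    mag = np.sqrt((dh ** 2).sum(axis=0) + (dv ** 2).sum(axis=0) + eps)
    return np.sum(beta * mag)
```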

Fig. 1. Structure of the cofactorization model: unmixing (image $\mathbf{Y}$, endmembers $\mathbf{M}$, abundances $\mathbf{A}$), clustering ($\min_{\mathbf{Z}} \|\mathbf{A}-\mathbf{B}\mathbf{Z}\|_F^2$) and classification (features $\mathbf{Z}$, classifier $\mathbf{Q}$, classification $\mathbf{C}_\mathcal{L}$, $\mathbf{C}_\mathcal{U}$). Variables in blue stand for observations or available external data. Variables in olive green are linked through the clustering term. The variable in a dotted box is assumed to be known beforehand.
C. Clustering
To define a global cofactorization problem, a relation is drawn between the activation matrices of the two factorization problems, namely the abundance matrix and the feature matrix. More specifically, following the idea developed in [8], a clustering term is introduced as a coupling. Abundance vectors are clustered and the resulting attribution vectors are then used as feature vectors for the classification. Ideally, clustering attribution vectors $\mathbf{z}_p \in \mathbb{R}^K$ are filled with zeros except for $z_{k,p} = 1$ when $\mathbf{a}_p$ is associated with the $k$th cluster. The well-known k-means is chosen to perform this task since it is easily expressed as an optimization problem

$$\min_{\mathbf{Z},\mathbf{B}} \frac{1}{2}\|\mathbf{A}-\mathbf{B}\mathbf{Z}\|_F^2 + \imath_{\mathcal{S}_K^{P}}(\mathbf{Z}) + \imath_{\mathbb{R}_+^{R\times K}}(\mathbf{B}) \tag{7}$$

where the columns of $\mathbf{B} \in \mathbb{R}^{R\times K}$ stand for the centroids of the $K$ clusters. Two constraints are considered in this k-means clustering problem: i) a positivity constraint on $\mathbf{B}$ since centroids are expected to be interpretable as mean abundance vectors and ii) the vectors $\mathbf{z}_p$ ($p \in \mathcal{P}$) are assumed to be defined on the $K$-dimensional probability simplex $\mathcal{S}_K$. Thus, the resulting clustering method is a particular instance of k-means where the attribution vectors are relaxed and can be interpreted as the collection of probabilities of belonging to each of the clusters.
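Since the attribution vectors are constrained to $\mathcal{S}_K$, the proximal operator associated with the simplex indicator in (7) is the column-wise Euclidean projection onto the simplex; [12] gives an exact finite-time algorithm for it. The sketch below uses the classical sort-based projection, which computes the same projection as [12], though by a simpler (not the fastest) route.

```python
import numpy as np

def project_simplex(Z):
    """Column-wise Euclidean projection onto the probability simplex, i.e.
    the proximal operator of the indicator of S_K in (7)."""
    K, P = Z.shape
    U = -np.sort(-Z, axis=0)                    # each column sorted, decreasing
    css = np.cumsum(U, axis=0) - 1.0
    ks = np.arange(1, K + 1)[:, None]
    cond = U - css / ks > 0                     # candidate support indicators
    rho = K - 1 - np.argmax(cond[::-1, :], axis=0)   # last index where cond holds
    tau = css[rho, np.arange(P)] / (rho + 1)
    return np.maximum(Z - tau, 0.0)
```

For example, the columns of project_simplex(np.random.rand(K, P)) are nonnegative and sum to one, as required for relaxed attribution vectors.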
D. Multi-objective problem
The two factorization problems corresponding to the spec-
tral unmixing and classification tasks have been expressed and
the link between these two problems has been set up through
the clustering term. The global cofactorization problem, illus-
trated in Figure 1, is finally formulated as
$$\begin{aligned} \min_{\mathbf{A},\mathbf{Q},\mathbf{Z},\mathbf{C}_\mathcal{U},\mathbf{B}} \ & \frac{\lambda_0}{2}\|\mathbf{Y}-\mathbf{M}\mathbf{A}\|_F^2 + \lambda_a\|\mathbf{A}\|_1 + \imath_{\mathbb{R}_+^{R\times P}}(\mathbf{A}) \\ & - \frac{\lambda_1}{2}\sum_{p\in\mathcal{P}} d_p \sum_{i\in\mathcal{C}} c_{i,p}\log\left(\frac{1}{1+\exp(-\mathbf{q}_{i:}\mathbf{z}_p)}\right) + \frac{\lambda_q}{2}\|\mathbf{Q}\|_F^2 + \lambda_c\|\mathbf{C}\|_{\mathrm{vTV}} + \imath_{\mathcal{S}_C^{|\mathcal{U}|}}(\mathbf{C}_\mathcal{U}) \\ & + \frac{\lambda_2}{2}\|\mathbf{A}-\mathbf{B}\mathbf{Z}\|_F^2 + \imath_{\mathcal{S}_K^{P}}(\mathbf{Z}) + \imath_{\mathbb{R}_+^{R\times K}}(\mathbf{B}) \end{aligned} \tag{8}$$

where $\lambda_0$, $\lambda_1$ and $\lambda_2$ are introduced to weight the contribution of the various terms.
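For illustration, the smooth and separable parts of (8) can be evaluated as follows, reusing the hypothetical helpers classification_loss() and smoothed_vtv() sketched in Section II-B; the default weights are illustrative values taken from the normalized settings reported in Section IV, and the indicator terms are assumed satisfied by the current iterates.

```python
import numpy as np

def cofactorization_objective(A, B, Z, Q, C, Y, M, d, beta, img_shape,
                              lam0=1.0, lam1=1.0, lam2=1.0,
                              lam_a=1e-3, lam_q=0.15, lam_c=0.1):
    """Sketch: evaluate the smooth and separable terms of (8)."""
    H, W = img_shape
    unmixing = 0.5 * lam0 * np.sum((Y - M @ A) ** 2) + lam_a * np.abs(A).sum()
    classification = (0.5 * lam1 * classification_loss(Q, Z, C, d, 0.0)
                      + 0.5 * lam_q * np.sum(Q ** 2)
                      + lam_c * smoothed_vtv(C.reshape(C.shape[0], H, W), beta))
    clustering = 0.5 * lam2 * np.sum((A - B @ Z) ** 2)
    return unmixing + classification + clustering
```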
III. OPTIMIZATION SCHEME
The proposed global optimization problem (8) is non-convex and non-smooth. Such problems are usually very challenging to solve. To handle it, we propose to resort to the PALM algorithm proposed in [9]. The PALM algorithm ensures convergence to a critical point, i.e., a local minimum of the objective function. To apply PALM, the objective is rewritten as a sum of independent non-smooth terms $f_j(\cdot)$ ($j \in \{0, \ldots, 3\}$) and a smooth coupling term $g(\cdot)$
$$\min_{\mathbf{A},\mathbf{B},\mathbf{Z},\mathbf{Q},\mathbf{C}_\mathcal{U}} f_0(\mathbf{A}) + f_1(\mathbf{B}) + f_2(\mathbf{Z}) + f_3(\mathbf{C}_\mathcal{U}) + g(\mathbf{A},\mathbf{B},\mathbf{Z},\mathbf{C}_\mathcal{U},\mathbf{Q})$$

where

$$f_0(\mathbf{A}) = \imath_{\mathbb{R}_+}(\mathbf{A}) + \lambda_a\|\mathbf{A}\|_1, \qquad f_1(\mathbf{B}) = \imath_{\mathbb{R}_+}(\mathbf{B}),$$
$$f_2(\mathbf{Z}) = \imath_{\mathcal{S}_K^{P}}(\mathbf{Z}), \qquad f_3(\mathbf{C}_\mathcal{U}) = \imath_{\mathcal{S}_C^{|\mathcal{U}|}}(\mathbf{C}_\mathcal{U}),$$

and

$$\begin{aligned} g(\mathbf{A},\mathbf{B},\mathbf{Z},\mathbf{C}_\mathcal{U},\mathbf{Q}) ={}& \frac{\lambda_0}{2}\|\mathbf{Y}-\mathbf{M}\mathbf{A}\|_F^2 - \frac{\lambda_1}{2}\sum_{p\in\mathcal{P}} d_p \sum_{i\in\mathcal{C}} c_{i,p}\log\left(\frac{1}{1+\exp(-\mathbf{q}_{i:}\mathbf{z}_p)}\right) \\ &+ \frac{\lambda_2}{2}\|\mathbf{A}-\mathbf{B}\mathbf{Z}\|_F^2 + \frac{\lambda_q}{2}\|\mathbf{Q}\|_F^2 + \lambda_c\|\mathbf{C}\|_{\mathrm{vTV}}. \end{aligned}$$
Algorithm 1: PALM
 1  Initialize variables $\mathbf{A}^0$, $\mathbf{B}^0$, $\mathbf{Z}^0$, $\mathbf{C}_\mathcal{U}^0$ and $\mathbf{Q}^0$;
 2  Set $\alpha > 1$;
 3  while stopping criterion not reached do
 4      $\mathbf{A}^{k+1} \leftarrow \mathrm{prox}^{f_0}_{\alpha L_A}\big(\mathbf{A}^k - \tfrac{1}{\alpha L_A}\nabla_{\mathbf{A}}\, g(\mathbf{A}^k, \mathbf{B}^k, \mathbf{Z}^k, \mathbf{C}_\mathcal{U}^k, \mathbf{Q}^k)\big)$;
 5      $\mathbf{B}^{k+1} \leftarrow \mathrm{prox}^{f_1}_{\alpha L_B}\big(\mathbf{B}^k - \tfrac{1}{\alpha L_B}\nabla_{\mathbf{B}}\, g(\mathbf{A}^{k+1}, \mathbf{B}^k, \mathbf{Z}^k, \mathbf{C}_\mathcal{U}^k, \mathbf{Q}^k)\big)$;
 6      $\mathbf{Z}^{k+1} \leftarrow \mathrm{prox}^{f_2}_{\alpha L_Z}\big(\mathbf{Z}^k - \tfrac{1}{\alpha L_Z}\nabla_{\mathbf{Z}}\, g(\mathbf{A}^{k+1}, \mathbf{B}^{k+1}, \mathbf{Z}^k, \mathbf{C}_\mathcal{U}^k, \mathbf{Q}^k)\big)$;
 7      $\mathbf{Q}^{k+1} \leftarrow \mathbf{Q}^k - \tfrac{1}{\alpha L_Q}\nabla_{\mathbf{Q}}\, g(\mathbf{A}^{k+1}, \mathbf{B}^{k+1}, \mathbf{Z}^{k+1}, \mathbf{C}_\mathcal{U}^k, \mathbf{Q}^k)$;
 8      $\mathbf{C}_\mathcal{U}^{k+1} \leftarrow \mathrm{prox}^{f_3}_{\alpha L_{C_\mathcal{U}}}\big(\mathbf{C}_\mathcal{U}^k - \tfrac{1}{\alpha L_{C_\mathcal{U}}}\nabla_{\mathbf{C}_\mathcal{U}}\, g(\mathbf{A}^{k+1}, \mathbf{B}^{k+1}, \mathbf{Z}^{k+1}, \mathbf{C}_\mathcal{U}^k, \mathbf{Q}^{k+1})\big)$;
 9  end
10  return $\mathbf{A}^{\mathrm{end}}$, $\mathbf{B}^{\mathrm{end}}$, $\mathbf{Z}^{\mathrm{end}}$, $\mathbf{Q}^{\mathrm{end}}$, $\mathbf{C}_\mathcal{U}^{\mathrm{end}}$
The concept of this algorithm is to perform a proximal gradient descent with respect to each variable alternately. To apply PALM, the functions $f_j(\cdot)$ have to be proper, lower semi-continuous, extended real-valued functions. A sufficient condition on the function $g(\cdot)$ is to be $C^2$, i.e., with continuous first and second derivatives, and its partial gradients have to be globally Lipschitz. $L_X$ denotes herein the Lipschitz constant associated with the partial gradient with respect to $\mathbf{X}$. The detailed steps of the algorithm are summarized in Algorithm 1 and further theoretical details are available in [9].

In practice, one needs to be able to compute the partial gradients and their associated Lipschitz constants to perform the gradient descent. It is also necessary to compute the proximal operators associated with the non-smooth terms. In the present case, the partial gradients are easily computed and are all globally Lipschitz. The only problematic term is the vTV term, which is not globally Lipschitz in its canonical form. To alleviate this, a smoothed counterpart has been introduced in (6) with a smoothing parameter $\epsilon \in \mathbb{R}_+$. As for the proximal operators, they are well known [12] except for $f_0(\cdot)$. For $f_0(\cdot)$, it is necessary to resort to the composition of the proximal operators associated with the nonnegativity constraint and the $\ell_1$-norm, which is possible here according to [13].
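To give a runnable picture of Algorithm 1, the sketch below applies PALM to the coupled unmixing/clustering blocks $(\mathbf{A}, \mathbf{B}, \mathbf{Z})$ only, omitting the classification blocks for brevity; it reuses project_simplex() from the clustering sketch. The partial gradients and Lipschitz constants follow directly from $g(\cdot)$, and the stopping rule is the normalized objective decrease described in Section IV. This is a simplified sketch under stated assumptions, not the authors' implementation.

```python
import numpy as np

def palm_unmix_cluster(Y, M, K, lam0=1.0, lam2=1.0, lam_a=1e-3,
                       alpha=1.1, tol=1e-4, max_iter=500):
    """Sketch of Algorithm 1 restricted to the (A, B, Z) blocks of (8)."""
    R, P = M.shape[1], Y.shape[1]
    rng = np.random.default_rng(0)
    A = np.maximum(np.linalg.lstsq(M, Y, rcond=None)[0], 0.0)
    B = rng.random((R, K))
    Z = project_simplex(rng.random((K, P)))
    MtM = M.T @ M
    obj_prev = np.inf
    for _ in range(max_iter):
        # A block: gradient step on g, then prox of f0 (l1 + nonnegativity)
        L_A = lam0 * np.linalg.norm(MtM, 2) + lam2
        t = 1.0 / (alpha * L_A)
        grad_A = lam0 * (MtM @ A - M.T @ Y) + lam2 * (A - B @ Z)
        A = np.maximum(A - t * grad_A - t * lam_a, 0.0)
        # B block: prox of f1 is the projection onto the nonnegative orthant
        L_B = lam2 * max(np.linalg.norm(Z @ Z.T, 2), 1e-12)
        t = 1.0 / (alpha * L_B)
        B = np.maximum(B - t * lam2 * (B @ Z - A) @ Z.T, 0.0)
        # Z block: prox of f2 is the column-wise simplex projection
        L_Z = lam2 * max(np.linalg.norm(B.T @ B, 2), 1e-12)
        t = 1.0 / (alpha * L_Z)
        Z = project_simplex(Z - t * lam2 * B.T @ (B @ Z - A))
        # stopping criterion: normalized decrease of the objective
        obj = (0.5 * lam0 * np.sum((Y - M @ A) ** 2) + lam_a * np.abs(A).sum()
               + 0.5 * lam2 * np.sum((A - B @ Z) ** 2))
        if abs(obj_prev - obj) <= tol * max(abs(obj), 1e-16):
            break
        obj_prev = obj
    return A, B, Z
```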
IV. EXPERIMENTS
Data generation – The HSI used to perform the experiments is
a semi-synthetic image. More specifically, the image has been
generated using a real HSI. The real image has been unmixed
using a fully constrained least square (FCLS) algorithm [14]
using R = 5 endmembers extracted with the well-known VCA
algorithm [15]. The obtained abundance maps have then been
used to generate a new synthetic image using pure spectra
from the hyperspectral library ASTER [16]. The groundtruth
of the original data, composed of C = 3 classes, has been
preserved to assess the quality of the classification. A color
composition, a panchromatic version and the groundtruth are
presented in Figure 2. The subset of the image used as training
data is also shown in Figure 2.
Fig. 2. Synthetic image: (a) colored composition of the HSI $\mathbf{Y}$, (b) panchromatic image $\mathbf{y}_{\mathrm{PAN}}$, (c) classification ground-truth, (d) training set.
Initialization and convergence – As stated before, cofactorization is a non-convex problem and PALM only ensures convergence to a local minimum of the objective function. It is thus important to carefully initialize the estimated variables in order to reach a relevant solution. In the presented experiment, the abundance matrix $\mathbf{A}^0$ has been initialized by solving $\min_{\mathbf{A}\in\mathbb{R}_+^{R\times P}} \|\mathbf{Y}-\mathbf{M}\mathbf{A}\|_F^2$ with a projected gradient algorithm. Then, a k-means algorithm has been applied to the obtained abundance vectors and the resulting centroids and attribution vectors have been used to initialize $\mathbf{B}^0$ and $\mathbf{Z}^0$. On the other hand, the classifier parameters $\mathbf{Q}^0$ and the classification matrix $\mathbf{C}_\mathcal{U}^0$ have been initialized randomly.
In order to assess the convergence of the optimization scheme, the normalized difference between two consecutive values of the objective function is monitored. When this value reaches a certain threshold ($10^{-4}$ for this experiment), the optimization process stops and the last estimate is assumed to be close enough to the solution.
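A possible rendering of this initialization strategy is sketched below, assuming a standard k-means (scipy's kmeans2 here) stands in for the paper's unspecified k-means variant; names and the iteration count are illustrative.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def initialize(Y, M, K, n_iter=200):
    """Sketch: projected gradient for A0, then k-means for B0 and Z0."""
    R, P = M.shape[1], Y.shape[1]
    A = np.zeros((R, P))
    step = 1.0 / np.linalg.norm(M.T @ M, 2)
    for _ in range(n_iter):                       # projected gradient descent
        A = np.maximum(A - step * (M.T @ (M @ A - Y)), 0.0)
    centroids, labels = kmeans2(A.T, K, minit='++')   # cluster abundances
    B = np.maximum(centroids.T, 0.0)              # (R, K) centroid matrix B0
    Z = np.zeros((K, P))
    Z[labels, np.arange(P)] = 1.0                 # one-hot attributions Z0
    return A, B, Z
```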
Hyperparameters – Multiple hyperparameters $\lambda_\cdot$ have been introduced in problem (8) to weight the various terms of the objective function. For practical use, these parameters have been normalized by the size and dynamics of the corresponding variables.
TABLE I
UNMIXING AND CLASSIFICATION RESULTS.

Model           Kappa   F1-mean   RMSE(Â)   RE      Time (s)
RF              0.817   0.842     N/A       N/A     0.4
FCLS            N/A     N/A       0.0701    0.224   1.2
CBPDN           N/A     N/A       0.0792    0.229   2
D-KSVD          0.494   0.554     N/A       0.923   70
Cofact.         0.847   0.870     0.0504    0.750   180
Cofact. + vTV   0.874   0.895     0.0526    0.752   81
These normalized parameters, denoted $\tilde{\lambda}_\cdot$, have been empirically tuned to obtain consistent results ($\tilde{\lambda}_0 = \tilde{\lambda}_1 = \tilde{\lambda}_2 = 1$, $\tilde{\lambda}_a = 10^{-3}$, $\tilde{\lambda}_q = 0.15$). For the last hyperparameter $\tilde{\lambda}_c$, two values have been considered, 0 and 0.1, standing respectively for the cases without and with spatial regularization. The definition of the vTV regularization also includes parameters which have to be properly set. First, the smoothing parameter is set to $\epsilon = 0.01$ to ensure the gradient-Lipschitz property without substantially modifying the TV-norm. Secondly, it is necessary to define the weighting coefficients $\beta_{m,n}$. They have been computed from a panchromatic image $\mathbf{y}_{\mathrm{PAN}}$, shown in Figure 2, generated by normalizing the hyperspectral bands by their means and then summing them. More precisely, to account for possible homogeneous areas in the image, they are defined as follows
$$\beta_p = \frac{\tilde{\beta}_p}{\sum_q \tilde{\beta}_q} \quad \text{with} \quad \tilde{\beta}_q = \left(\left\|[\nabla \mathbf{y}_{\mathrm{PAN}}]_q\right\|_2 + \sigma\right)^{-1}$$

where $\sigma = 0.01$ controls the variation of the weights and avoids numerical issues.
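A minimal sketch of these weights, assuming $[\nabla \mathbf{y}_{\mathrm{PAN}}]_q$ denotes a first-order discrete spatial gradient of the panchromatic image at pixel $q$ (the exact discretization is an assumption):

```python
import numpy as np

def vtv_weights(y_pan, sigma=0.01):
    """Sketch: edge-aware vTV weights from a panchromatic image (H, W).
    Weights are small across strong gradients and large in flat areas."""
    dh = np.zeros_like(y_pan)
    dv = np.zeros_like(y_pan)
    dh[:-1, :] = y_pan[1:, :] - y_pan[:-1, :]     # horizontal differences
    dv[:, :-1] = y_pan[:, 1:] - y_pan[:, :-1]     # vertical differences
    b = 1.0 / (np.sqrt(dh ** 2 + dv ** 2) + sigma)
    return b / b.sum()                            # normalize: weights sum to 1
```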
Compared methods – To assess the quality of the unmix-
ing and classification results, the proposed method has been
compared to several well-known unmixing and classification
algorithms. Regarding classification, we considered the ran-
dom forest (RF) algorithm, known to perform very well for HSI classification. Parameters of the RF (number of trees, depth)
have been adjusted using gridsearch and cross-validation. The
discriminative K-SVD (D-KSVD) method has been used as a
benchmark [17]. This model is also a cofactorization method
but with a simpler approach where the two coding matrices A
and Z are imposed to be equal. In this case, the first term is not
a spectral unmixing task but rather a dictionary learning task
where dictionary elements are assumed to be discriminative
for the classification task. Only a sparsity penalization is
considered for D-KSVD, using an $\ell_0$-norm.
As for the unmixing comparison, we considered two meth-
ods described in [14]. The first method is the fully constrained
least square method (FCLS) where the corresponding opti-
mization problem is defined as the data fitting term with a
positivity and sum-to-one constraints on the abundance vectors $\mathbf{a}_p$.
The second method is the constrained basis pursuit denoising
(CBPDN), corresponding to problem (1). The hyperparameter $\lambda_a$, weighting the sparsity penalty, is also adjusted using
gridsearch and cross-validation. It should be noted that all
unmixing methods use directly the correct endmember matrix
M which has been used to generate the data. Additionally, the
endmember matrix is used to initialize the dictionary of the
D-KSVD method.

References
Proceedings ArticleDOI
14 Jun 2010
TL;DR: Two new algorithms to efficiently solve convex optimization problems, based on the alternating direction method of multipliers, a method from the augmented Lagrangian family, are introduced and are shown to outperform off-the-shelf methods in terms of speed and accuracy.
Abstract: Convex optimization problems are common in hyperspectral unmixing. Examples are the constrained least squares (CLS) problem used to compute the fractional abundances in a linear mixture of known spectra, the constrained basis pursuit (CBP) to find sparse (i.e., with a small number of terms) linear mixtures of spectra, selected from large libraries, and the constrained basis pursuit denoising (CBPDN), which is a generalization of BP to admit modeling errors. In this paper, we introduce two new algorithms to efficiently solve these optimization problems, based on the alternating direction method of multipliers, a method from the augmented Lagrangian family. The algorithms are termed SUnSAL (sparse unmixing by variable splitting and augmented Lagrangian) and C-SUnSAL (constrained SUnSAL). C-SUnSAL solves the CBP and CBPDN problems, while SUnSAL solves CLS as well as a more general version thereof, called constrained sparse regression (CSR). C-SUnSAL and SUnSAL are shown to outperform off-the-shelf methods in terms of speed and accuracy.

580 citations


"Matrix Cofactorization for Joint Un..." refers methods in this paper

  • ...The real image has been unmixed using a fully constrained least square (FCLS) algorithm [14] using R = 5 endmembers extracted with the well-known VCA algorithm [15]....


  • ...As for the unmixing comparison, we considered two methods described in [14]....


Journal ArticleDOI
TL;DR: A new algorithm is proposed to project, exactly and in finite time, a vector of arbitrary size onto a simplex or an $$l_1$$l1-norm ball as a Gauss–Seidel-like variant of Michelot’s variable fixing algorithm.
Abstract: A new algorithm is proposed to project, exactly and in finite time, a vector of arbitrary size onto a simplex or an $$l_1$$l1-norm ball. It can be viewed as a Gauss---Seidel-like variant of Michelot's variable fixing algorithm; that is, the threshold used to fix the variables is updated after each element is read, instead of waiting for a full reading pass over the list of non-fixed elements. This algorithm is empirically demonstrated to be faster than existing methods.

253 citations


"Matrix Cofactorization for Joint Un..." refers background in this paper

  • ...As for the proximal operators, they are are well-known [12] except for f0(·)....


Journal ArticleDOI
TL;DR: A method to address the problem of mixed pixels and to obtain a finer spatial resolution of the land cover classification maps is proposed, which exploits the advantages of both soft classification techniques and spectral unmixing algorithms, in order to determine the fractional abundances of the classes at a sub-pixel scale.
Abstract: The problem of classification of hyperspectral images containing mixed pixels is addressed. Hyperspectral imaging is a continuously growing area of remote sensing applications. The wide spectral range of such imagery, providing a very high spectral resolution, allows to detect and classify surfaces and chemical elements of the observed image. The main problem of hyperspectral data is the (relatively) low spatial resolution, which can vary from a few to tens of meters. Many factors make the spatial resolution one of the most expensive and hardest to improve in imaging systems. For classification, the major problem caused by low spatial resolution are the mixed pixels, i.e., parts of the image where more than one land cover map lie in the same pixel. In this paper, we propose a method to address the problem of mixed pixels and to obtain a finer spatial resolution of the land cover classification maps. The method exploits the advantages of both soft classification techniques and spectral unmixing algorithms, in order to determine the fractional abundances of the classes at a sub-pixel scale. Spatial regularization by simulated annealing is finally performed to spatially locate the obtained classes. Experiments carried out on synthetic real data sets show excellent results both from a qualitative and quantitative point of view.

164 citations

Proceedings Article
05 Dec 2013
TL;DR: This paper initiates a systematic investigation of when the proximal map of a sum of functions decomposes into the composition of the proxies of the individual summands, and unifies a few known results scattered in the literature and discovers several new decompositions obtained almost effortlessly from the theory.
Abstract: The proximal map is the key step in gradient-type algorithms, which have become prevalent in large-scale high-dimensional problems. For simple functions this proximal map is available in closed-form while for more complicated functions it can become highly nontrivial. Motivated by the need of combining regularizers to simultaneously induce different types of structures, this paper initiates a systematic investigation of when the proximal map of a sum of functions decomposes into the composition of the proximal maps of the individual summands. We not only unify a few known results scattered in the literature but also discover several new decompositions obtained almost effortlessly from our theory.

99 citations


"Matrix Cofactorization for Joint Un..." refers background in this paper

  • ...For f0(·), it is necessary to resort to the composition of the proximal operators associated to the non-negative constraint and the 1norm, which is here possible according to [13]....


Journal ArticleDOI
TL;DR: The results show that the proposed framework can provide better abundance estimates and, more specifically, can significantly improve the abundance estimates for the pixels affected by shadows.
Abstract: Spectral unmixing (SU) methods incorporating the spatial regularizations have demonstrated increasing interest. Although spatial regularizers that promote smoothness of the abundance maps have been widely used, they may overly smooth these maps and, in particular, may not preserve edges present in the hyperspectral image. Existing unmixing methods usually ignore these edge structures or use edge information derived from the hyperspectral image itself. However, this information may be affected by the large amounts of noise or variations in illumination, leading to erroneous spatial information incorporated into the unmixing procedure. This paper proposes a simple yet powerful SU framework that incorporates external data [i.e. light detection and ranging (LiDAR) data]. The LiDAR measurements can be easily exploited to adjust the standard spatial regularizations applied to the unmixing process. The proposed framework is rigorously evaluated using two simulated data sets and a real hyperspectral image. It is compared with methods that rely on spatial information derived from a hyperspectral image. The results show that the proposed framework can provide better abundance estimates and, more specifically, can significantly improve the abundance estimates for the pixels affected by shadows.

26 citations

Frequently Asked Questions (13)
Q1. What are the contributions in this paper?

This paper introduces a matrix cofactorization approach to perform spectral unmixing and classification jointly. After formulating the unmixing and classification tasks as matrix factorization problems, a link is introduced between the two coding matrices, namely the abundance matrix and the feature matrix. 

M. Belgiu and L. Drăguţ, “Random forest in remote sensing: A review of applications and future directions,” ISPRS Journal of Photogrammetry and Remote Sensing, vol. 114, pp. 24–31, 2016.

Considering a linear classifier parametrized by the matrix Q ∈ RC×K , a vector-wise nonlinear mapping φ(·), such as a sigmoid or a softmax operator, is then applied to the output of the classifier. 

To evaluate the classification accuracy, two conventional metrics are used, namely Cohen’s kappa coefficient and the averaged F1-score over all classes [18]. 

More precisely, the coupling term is expressed as a clustering term over the abundance vectors where the attribution vectors to the clusters are also the feature vectors of the classification as detailed in Section II-C. 

Two constraints are considered in this k-means clustering problem: i) a positivity constraint on B since centroids are expected to be interpretable as mean abundance vectors and ii) the vectors zp (p ∈ P) are assumed to be defined on the K-dimensional probability simplex SK.

It should be noted that all unmixing methods use directly the correct endmember matrix M which has been used to generate the data. 

This paper introduces a unified framework to jointly perform spectral unmixing and classification by means of a cofactorization problem.

The index subset of labeled pixels is denoted hereafter L while the index subset of unlabeled pixels is U (L ∩ U = ∅ and L ∪ U = P).

To evaluate the unmixing results quantitatively, the reconstruction error (RE) and root global mean squared error (RMSE) are considered, i.e., $\mathrm{RE} = \sqrt{\frac{1}{PL}\left\|\mathbf{Y}-\mathbf{M}\hat{\mathbf{A}}\right\|_F^2}$ and $\mathrm{RMSE} = \sqrt{\frac{1}{PR}\left\|\mathbf{A}_{\mathrm{true}}-\hat{\mathbf{A}}\right\|_F^2}$.

To define a global cofactorization problem, a relation is drawn between the activation matrices of the two factorization problems, namely the abundance matrix and the feature matrix. 

The weighting coefficients dp adjust the cost function with respect to the sizes of the training and test sets, in particular in the case of unbalanced classes.

Results reported in Table I show that the proposed cofactorization framework outperforms both RF and D-KSVD in terms of classification.