
Graph embedding discriminant analysis on Grassmannian
manifolds for improved image set matching
Author
Harandi, Mehrtash T, Sanderson, Conrad, Shirazi, Sareh, Lovell, Brian C
Published
2011
Conference Title
CVPR 2011
Version
Accepted Manuscript (AM)
DOI
https://doi.org/10.1109/cvpr.2011.5995564
Copyright Statement
© 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be
obtained for all other uses, in any current or future media, including reprinting/republishing this
material for advertising or promotional purposes, creating new collective works, for resale or
redistribution to servers or lists, or reuse of any copyrighted component of this work in other
works.
Downloaded from
http://hdl.handle.net/10072/401034
Griffith Research Online
https://research-repository.griffith.edu.au

Graph Embedding Discriminant Analysis on Grassmannian Manifolds
for Improved Image Set Matching
Mehrtash T. Harandi, Conrad Sanderson, Sareh Shirazi, Brian C. Lovell
NICTA, PO Box 6020, St Lucia, QLD 4067, Australia
The University of Queensland, School of ITEE, QLD 4072, Australia
Abstract
A convenient way of dealing with image sets is to represent
them as points on Grassmannian manifolds. While several
recent studies explored the applicability of discriminant
analysis on such manifolds, the conventional formalism of
discriminant analysis suffers from not considering the local
structure of the data. We propose a discriminant analysis
approach on Grassmannian manifolds, based on a graph-
embedding framework. We show that by introducing within-
class and between-class similarity graphs to characterise
intra-class compactness and inter-class separability, the ge-
ometrical structure of data can be exploited. Experiments
on several image datasets (PIE, BANCA, MoBo, ETH-80)
show that the proposed algorithm obtains considerable im-
provements in discrimination accuracy, in comparison to
three recent methods: Grassmann Discriminant Analysis
(GDA), Kernel GDA, and the kernel version of Affine Hull
Image Set Distance. We further propose a Grassmannian
kernel, based on canonical correlation between subspaces,
which can increase discrimination accuracy when used in
combination with previous Grassmannian kernels.
1. Introduction
In contrast to object recognition approaches based on
considering one image at a time, there has been a recent
surge of interest in techniques based on explicit image set
matching [9, 16, 25, 26]. This is mainly driven by the need
for superior discrimination accuracy as well as increased
robustness to practical issues such as pose variations, mis-
alignment and varying environmental conditions (for exam-
ple, as present in realistic face recognition scenarios [21]).
While image set matching can be accomplished through
probability-density based methods [3, 8] and aggregation
methods [17], it has been shown that better performance can
be attained through modelling image sets via linear structures (ie., subspaces) [25, 29]. Subspaces appear to be appropriate models for this task since they are able to accommodate the effects of various image variations. For example, an acceptable and widely used approximation for photometric invariance, under conditions of no shadowing and Lambertian reflectance, is a 4-dimensional linear space [1].
Acknowledgements: NICTA is funded by the Australian Government as represented by the Department of Broadband, Communications and the Digital Economy, as well as the Australian Research Council through the ICT Centre of Excellence program. The second and third authors contributed equally. We thank Prof. Terry Caelli for useful discussions.
A convenient way of dealing with subspaces is
to represent them as points on Grassmannian mani-
folds [11, 13, 19, 25]. Recently, several studies explored the
applicability of discriminant analysis (DA) on such man-
ifolds [13, 26]. Given subspaces that are represented as
points on a Grassmannian manifold M, the underlying idea is to map them to another Grassmannian manifold M′, such that a measure of discriminatory power on M′ is maximised (see Fig. 1 for a conceptual example).
While the approaches presented in [13, 26] show promis-
ing results, the conventional formalism of DA suffers from
not being able to take into account the local structure of
data [10, 15]. For example, outliers and multi-modal classes
can adversely affect the discrimination and/or generalisa-
tion ability of models based on conventional DA.
Motivated by advances in DA over Euclidean vector
spaces [30, 24], we propose a novel DA on Grassmannian
manifolds, based on a graph-embedding framework [30].
We show that considerable gains in discrimination accu-
racy can be obtained by exploiting the geometrical structure
and local information on Grassmannian manifolds. This
is achieved by introducing within-class and between-class
similarity graphs to characterise intra-class compactness
and inter-class separability, respectively.
The proposed method for DA on Grassmannian mani-
folds is somewhat related to distance metric learning meth-
ods [28]. The main points of difference include the use of
graphs and manifolds in contrast to the typical use of vector
spaces in distance metric learning. Overall, the proposed
method can be considered as an extension of both graph-
embedding and distance metric learning to higher order data
structures.
We also propose a new kernel, based on canonical cor-
relation between subspaces, for measuring the similarity of
two points on a Grassmannian manifold. We empirically
show that, in combination with previous Grassmannian ker-
nels, the new kernel can result in considerable discrimina-
tion accuracy improvements.

Figure 1. A conceptual illustration of the proposed approach. (a) Image-sets can be described in R^D by linear subspaces. To compare two linear subspaces, the principal angles between them can be used. For clarity just two subspaces are shown. (b) Linear subspaces in R^D can be represented as points on the Grassmannian manifold M. Having a proper geodesic distance between the points on the manifold, it is possible to convert the image-set matching problem into a point-to-point classification problem. (c) By having a Grassmannian kernel in hand, points on the Grassmannian manifold can be mapped into another Grassmannian manifold where not only certain local properties have been retained but also the discriminatory power between classes has been increased. Unlike the conventional formalism of discriminant analysis, the proposed method preserves the geometrical structure and local information on Grassmannian manifolds by exploiting within-class and between-class similarity graphs.
The paper proceeds as follows. Section 2 provides an
overview of Grassmannian analysis, which leads to the pro-
posed graph embedding discriminant analysis in Section 3.
We introduce the Grassmannian canonical correlation ker-
nel in Section 4. In Section 5 we briefly describe the overall
computational complexity of the proposed method. In Sec-
tion 6 we compare the performance of the proposed method
and kernel with previous approaches on several object and
face datasets. The main findings and possible future direc-
tions are summarised in Section 7.
2. Grassmannian Analysis
Manifold analysis has been successfully employed in various disciplines. Amari and Nagaoka state
that many important structures in information theory and
statistics can be treated as structures in differential geom-
etry by regarding a space of probabilities as a Riemannian
manifold [2]. A manifold is a topological space that is lo-
cally similar to Euclidean space. At an intuitive level, man-
ifolds can be thought of as smooth, curved surfaces embed-
ded in higher dimensional Euclidean spaces. Riemannian
manifolds are endowed with a distance measure which al-
lows us to measure how similar two points are. In this work
we are interested in a particular class of Riemannian mani-
folds, known as Grassmannian manifolds [11].
Points on a Grassmannian manifold G_{D,m} can be viewed as the set of m-dimensional subspaces of R^D, and are represented by orthonormal matrices, each of size D × m. Two points on a Grassmannian manifold are equivalent if one can be mapped into the other by an m × m orthogonal matrix [11].
Grassmannian analysis provides a natural way to tackle the problem of image set matching. Specifically, as G_{D,m} is the manifold parameterising m-dimensional real vector subspaces of the D-dimensional vector space R^D, the classification problem of matching sets comprising m images, where each image is described by D pixels, can be transformed into a point classification problem on G_{D,m}.
During the past decade, the concept of angles between subspaces (ie., principal angles) has been widely used for image set matching [29]. Since Grassmannian manifolds are curved and the shortest path between two points is a geodesic, it is not surprising that geodesic distances over Grassmannian manifolds may outperform methods based directly on principal angles. We note that principal angles can be considered a simple form of geodesic distance on Grassmannian manifolds [19].
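To make the above concrete, the following sketch (Python with NumPy; the function names and the toy data are ours, not from the paper) represents an image set as a point on G_{D,m} via a thin SVD, and computes the principal angles between two such subspaces from the singular values of X_1^T X_2:

import numpy as np

def image_set_to_subspace(images, m):
    """Represent an image set as a point on G_{D,m}.

    images : array of shape (D, n), each column a vectorised image.
    Returns a D x m matrix with orthonormal columns (the m leading
    left singular vectors), ie., an orthonormal basis of the subspace."""
    U, _, _ = np.linalg.svd(images, full_matrices=False)
    return U[:, :m]

def principal_angles(X1, X2):
    """Principal angles (in radians) between two subspaces given by
    orthonormal bases X1 and X2 (both D x m); their cosines are the
    singular values of X1^T X2."""
    s = np.linalg.svd(X1.T @ X2, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

# toy usage: D = 400 pixels, 20 images per set, m = 5
rng = np.random.default_rng(0)
X1 = image_set_to_subspace(rng.standard_normal((400, 20)), m=5)
X2 = image_set_to_subspace(rng.standard_normal((400, 20)), m=5)
print(principal_angles(X1, X2))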
Grassmannian kernels [13, 14, 27] allow us to treat the
Grassmannian space as if it were a Euclidean vector space.
As a result, learning algorithms in vector spaces can be ex-
tended to their counterparts on Grassmannian manifolds,
eg., kernel discriminant analysis [13, 26]. In the following
section we will demonstrate how Grassmannian kernels can
be employed to map points on a Grassmannian manifold
onto another Grassmannian manifold, where a measure of
discriminatory power between classes has been maximised.
3. Graph Embedding Discriminant Analysis
Linear Discriminant Analysis (LDA) is a supervised sta-
tistical learning method that seeks a linear projection by si-
multaneously maximising the between-class dissimilarities
and minimising the within-class dissimilarities [6]. While
LDA has been successfully applied to various computer
vision problems, eg., face recognition [5], it suffers from
not being able to naturally capture the local structure of
data [10, 24]. For example, LDA has problems handling

multi-modal classes (where each class is comprised of sev-
eral separate clusters) or when there are outliers in the data.
This stems from treating all data points in the same manner
(during the calculation of within-class and between-class
scatter matrices), no matter how they are related to their
classes.
To alleviate the above problem, a graph-embedding
framework can be used [7, 24, 30]. A graph (V , W ) in our
context refers to a collection of vertices or nodes, V , and a
collection of edges that connect pairs of vertices. We note
that W is a symmetric matrix with elements describing the
similarity between pairs of vertices. Moreover, the diagonal
matrix D and the Laplacian matrix L of a graph are defined as L = D − W, with the diagonal elements of D obtained as D(i,i) = Σ_{j≠i} W(i,j).
Given a graph in a vector space, the purpose of graph-
embedding DA is to maximise a measure of discriminatory
power by mapping the underlying data into another vec-
tor space (usually with lower dimensionality) while pre-
serving similarities between vertex pairs. This problem
can be solved through a generalised eigen-analysis frame-
work [30]. In the following text, we formulate the discrim-
inant analysis over Grassmannian manifolds based on the
graph-embedding framework.
Given N labelled points X = {(X_i, l_i)}_{i=1}^{N} from the underlying Grassmannian manifold M, where X_i ∈ R^{D×m} and l_i ∈ {1, 2, ..., C}, with C denoting the number of classes, the local geometrical structure of M can be modelled by building a within-class similarity graph W_w and a between-class similarity graph W_b. The simplest forms of W_w and W_b are based on the nearest neighbour graphs defined in Eqns. (1) and (2):
\[
W_w(i,j) = \begin{cases} 1, & \text{if } X_i \in N_w(X_j) \text{ or } X_j \in N_w(X_i) \\ 0, & \text{otherwise} \end{cases} \tag{1}
\]
\[
W_b(i,j) = \begin{cases} 1, & \text{if } X_i \in N_b(X_j) \text{ or } X_j \in N_b(X_i) \\ 0, & \text{otherwise} \end{cases} \tag{2}
\]
In Eqn. (1), N_w(X_i) is the set of v neighbours {X_i^1, X_i^2, ..., X_i^v} sharing the same label as l_i. Similarly, in Eqn. (2), N_b(X_i) contains v neighbours having different labels. We note that more complex similarity graphs, like heat kernel graphs, can also be used to encode distances between points on Grassmannian manifolds [20].
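A minimal sketch of this graph construction (Python/NumPy; the function and variable names are ours, and the distance matrix is assumed to come from some distance between Grassmannian points, eg., derived from a Grassmannian kernel):

import numpy as np

def build_similarity_graphs(dist, labels, v):
    """Build the within-class (W_w) and between-class (W_b) nearest-neighbour
    graphs of Eqns. (1)-(2), plus the diagonal matrix D_w and the
    between-class Laplacian L_b.

    dist   : (N, N) symmetric matrix of distances between the N points.
    labels : length-N array of class labels l_i.
    v      : number of neighbours per point."""
    labels = np.asarray(labels)
    N = len(labels)
    W_w = np.zeros((N, N))
    W_b = np.zeros((N, N))
    for i in range(N):
        same = np.where((labels == labels[i]) & (np.arange(N) != i))[0]
        diff = np.where(labels != labels[i])[0]
        # v nearest neighbours with the same label (Eqn. 1), symmetric "or" rule
        for j in same[np.argsort(dist[i, same])][:v]:
            W_w[i, j] = W_w[j, i] = 1.0
        # v nearest neighbours with a different label (Eqn. 2)
        for j in diff[np.argsort(dist[i, diff])][:v]:
            W_b[i, j] = W_b[j, i] = 1.0
    D_w = np.diag(W_w.sum(axis=1))          # D(i,i) = sum_{j != i} W(i,j)
    L_b = np.diag(W_b.sum(axis=1)) - W_b    # L = D - W
    return W_w, W_b, D_w, L_b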
Our aim is to maximise discriminatory power while simultaneously preserving geometry, by mapping the points on M to a new manifold M′, ie., α : X_i → Y_i. A suitable transform would place the connected points of W_w as close as possible, while moving the connected points of W_b as far as possible. Such a mapping can be described by optimising the following two objective functions:
\[
f_1 = \min \frac{1}{2} \sum_{i,j} \left( Y_i - Y_j \right)^2 W_w(i,j) \tag{3}
\]
\[
f_2 = \max \frac{1}{2} \sum_{i,j} \left( Y_i - Y_j \right)^2 W_b(i,j) \tag{4}
\]
Eqn. (3) punishes neighbours in the same class if they are mapped far away in M′, while Eqn. (4) punishes points of different classes if they are mapped close together in M′.
Assume that points on the manifold are implicitly known and only a measure of similarity between them is available through a Grassmannian kernel¹, k_ij = ⟨X_i, X_j⟩. Confining the solution to be linear, ie., α_i = Σ_{j=1}^{N} a_{ij} X_j, we will have:
\[
Y_i = \left( \langle \alpha_1, X_i \rangle, \langle \alpha_2, X_i \rangle, \cdots, \langle \alpha_r, X_i \rangle \right)^T \tag{5}
\]
By defining A_l = (a_{l1}, a_{l2}, ..., a_{lN})^T and K_i = (k_{i1}, k_{i2}, ..., k_{iN})^T, it can be shown that ⟨α_l, X_i⟩ = A_l^T K_i. Hence Eqn. (3) can be simplified to:
\[
\begin{aligned}
\frac{1}{2} \sum_{i,j} \left( Y_i - Y_j \right)^2 W_w(i,j)
&= \frac{1}{2} \sum_{i,j} \left( A^T K_i - A^T K_j \right)^2 W_w(i,j) \\
&= \sum_{i} A^T K_i \, D_w(i,i) \, K_i^T A \; - \; \sum_{i,j} A^T K_i \, W_w(i,j) \, K_j^T A \\
&= A^T K D_w K^T A - A^T K W_w K^T A
\end{aligned} \tag{6}
\]
where A = [A_1 | A_2 | ... | A_r] and K = [K_1 | K_2 | ... | K_N]. Considering that L_b = D_b − W_b, in a similar manner it can be shown that Eqn. (4) can be simplified to:
\[
\frac{1}{2} \sum_{i,j} \left( Y_i - Y_j \right)^2 W_b(i,j)
= A^T K D_b K^T A - A^T K W_b K^T A
= A^T K L_b K^T A \tag{7}
\]
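The simplifications in Eqns. (6) and (7) are instances of the standard graph Laplacian identity, (1/2) Σ_{i,j} ||Y_i − Y_j||² W(i,j) = tr(A^T K (D − W) K^T A) for Y_i = A^T K_i. The snippet below (Python/NumPy; purely illustrative, not the authors' code) verifies the identity numerically on random data:

import numpy as np

rng = np.random.default_rng(0)
N, r = 8, 3

# random symmetric similarity graph with an empty diagonal
W = rng.random((N, N)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
D = np.diag(W.sum(axis=1))
L = D - W

# mapped points Y_i = A^T K_i, stacked as columns of Y (r x N)
K = rng.standard_normal((N, N))
A = rng.standard_normal((N, r))
Y = A.T @ K

lhs = 0.5 * sum(W[i, j] * np.sum((Y[:, i] - Y[:, j]) ** 2)
                for i in range(N) for j in range(N))
rhs = np.trace(A.T @ K @ L @ K.T @ A)   # = A^T K D K^T A - A^T K W K^T A
print(np.isclose(lhs, rhs))             # True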
Following [7, 30], a constraint is imposed on Eqn. (3) and the minimisation problem is converted to a maximisation one. Specifically, by forcing A^T K D_w K^T A to be a constant such as 1, Eqn. (3) becomes the following maximisation problem:
\[
\min \left\{ A^T K D_w K^T A - A^T K W_w K^T A \right\}
= \min \left\{ 1 - A^T K W_w K^T A \right\}
= \max \left\{ A^T K W_w K^T A \right\} \tag{8}
\]
subject to
\[
A^T K D_w K^T A = 1 \tag{9}
\]
By converting both problems into maximisation, the overall optimisation problem is hence:
\[
\max \left\{ A^T K \left( L_b + \beta W_w \right) K^T A \right\}
\quad \text{subject to} \quad
A^T K D_w K^T A = 1 \tag{10}
\]
where β is a Lagrangian multiplier that acts as a regularisation parameter in the final solution. The solution of (10) can be found through the following generalised eigenvalue problem:
\[
K \left( L_b + \beta W_w \right) K^T A = \lambda \, K D_w K^T A \tag{11}
\]
More specifically, the desired projection matrix A is equal to the r largest eigenvectors of the Rayleigh quotient:
\[
\left( K D_w K^T \right)^{-1} K \left( L_b + \beta W_w \right) K^T \tag{12}
\]
¹ We use the notation ⟨X_i, X_j⟩ to indicate a similarity measure between points X_i and X_j on a Grassmannian manifold. This is similar in principle to an inner product in Hilbert space, as used in kernel-based methods [22].
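A minimal sketch of solving the generalised eigenvalue problem in Eqn. (11) (Python with NumPy/SciPy; not the authors' implementation, and the small ridge added to the right-hand-side matrix is our own numerical safeguard):

import numpy as np
from scipy.linalg import eigh

def solve_projection(K, W_w, L_b, D_w, beta, r):
    """Solve K (L_b + beta W_w) K^T A = lambda K D_w K^T A (Eqn. 11) and
    return the N x r projection matrix A, with columns sorted by
    decreasing eigenvalue."""
    S = K @ (L_b + beta * W_w) @ K.T
    M = K @ D_w @ K.T
    # small ridge keeps the right-hand-side matrix positive definite
    M = M + 1e-8 * np.trace(M) / len(M) * np.eye(len(M))
    eigvals, eigvecs = eigh(S, M)          # generalised symmetric problem
    order = np.argsort(eigvals)[::-1]      # descending eigenvalues
    return eigvecs[:, order[:r]]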

Fig. 2 outlines the proposed graph embedding method on Grassmannian manifolds. The proposed algorithm uses the points on the Grassmannian manifold implicitly (ie., via measuring similarities through a kernel) to obtain a mapping A = [A_1 | A_2 | ... | A_r] that maximises a quotient similar to discriminant analysis, while retaining the overall geometrical structure.
Upon acquiring the mapping A, the matching problem over Grassmannian manifolds is reduced to classification in vector spaces. More precisely, for any query image set X_q, a vector representation using the kernel function and the mapping A is acquired, ie., V_q = A^T K_q, where K_q = (⟨X_1, X_q⟩, ⟨X_2, X_q⟩, ..., ⟨X_N, X_q⟩)^T. Similarly, gallery points X_i are represented by r-dimensional vectors V_i = A^T K_i, and classification methods such as Nearest-Neighbour or Support Vector Machines [6] can be employed to label X_q.
4. Grassmannian Kernels
The similarity between two points on a Grassmannian manifold, eg., X_i and X_j ∈ R^{D×m}, can be measured using kernels such as the projection kernel:
\[
k^{[\mathrm{proj}]}_{i,j} = \left\| X_i^T X_j \right\|_F^2 \tag{13}
\]
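Given orthonormal bases, Eqn. (13) is a one-liner; a sketch (Python/NumPy, our own naming):

import numpy as np

def projection_kernel(X_i, X_j):
    """Projection kernel of Eqn. (13): squared Frobenius norm of X_i^T X_j,
    where X_i and X_j are D x m matrices with orthonormal columns."""
    return np.linalg.norm(X_i.T @ X_j, 'fro') ** 2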
One of the first attempts to solve the problem of image set
matching was based on the notion of principal angles. More
precisely, Yamaguchi et al. [29] used the largest canonical correlation value (ie., the cosine of the smallest principal angle) to measure the similarity between two image sets. In Section 4.1
we show that the largest canonical correlation between sub-
spaces is a kernel on Grassmannian manifolds. We then
show in Section 4.2 that a more complex kernel, created
through linearly combining existing Grassmannian kernels,
is also a Grassmannian kernel.
We will later demonstrate that combining the projection
kernel with the proposed canonical correlation kernel can
lead to considerable improvements in discrimination accu-
racy, in the context of the proposed graph-embedding dis-
criminant analysis.
4.1. Canonical Correlation Kernel
Given subspaces X_i and X_j, we define the canonical correlation kernel as:
\[
k^{[CC]}_{i,j} = \max_{a_p \in \mathrm{span}(X_i)} \; \max_{b_q \in \mathrm{span}(X_j)} a_p^T b_q \tag{14}
\]
subject to a_p^T a_p = b_p^T b_p = 1 and a_p^T a_q = b_p^T b_q = 0, p ≠ q.
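Since the canonical correlations between span(X_i) and span(X_j) are the singular values of X_i^T X_j, the kernel in Eqn. (14) can be evaluated as the largest such singular value; a sketch under that reading (Python/NumPy, our own naming):

import numpy as np

def canonical_correlation_kernel(X_i, X_j):
    """Canonical correlation kernel of Eqn. (14): the largest canonical
    correlation between span(X_i) and span(X_j), ie., the largest singular
    value of X_i^T X_j (the cosine of the smallest principal angle).
    X_i and X_j are D x m matrices with orthonormal columns."""
    return np.linalg.svd(X_i.T @ X_j, compute_uv=False)[0]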
For k^{[CC]} to be a Grassmannian kernel [14], it must be (i) positive definite, and (ii) well defined, meaning it is invariant to the various representations of the subspaces, ie., k(X_1, X_2) = k(X_1 R_1, X_2 R_2) for any R_1, R_2 ∈ Q(m), where Q(m) indicates orthonormal matrices of order m. Since the singular values of X_1^T X_2 are equal to those of R_1^T X_1^T X_2 R_2, the canonical correlation kernel is well defined. To show that the kernel matrix [K]_{ij} = k^{[CC]}_{i,j} is positive definite, it suffices to show that z^T K z > 0 for z ∈ R^n:
\[
\begin{aligned}
z^T K z &=
\begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix}^T
\begin{pmatrix}
k^{[CC]}_{1,1} & k^{[CC]}_{1,2} & \cdots & k^{[CC]}_{1,n} \\
k^{[CC]}_{2,1} & k^{[CC]}_{2,2} & \cdots & k^{[CC]}_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
k^{[CC]}_{n,1} & k^{[CC]}_{n,2} & \cdots & k^{[CC]}_{n,n}
\end{pmatrix}
\begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{pmatrix} \\
&= z_1^2 k^{[CC]}_{1,1} + z_2^2 k^{[CC]}_{2,2} + \cdots + z_n^2 k^{[CC]}_{n,n} \\
&\quad + 2\left( z_1 z_2 k^{[CC]}_{1,2} + z_1 z_3 k^{[CC]}_{1,3} + \cdots + z_1 z_n k^{[CC]}_{1,n} \right) \\
&\quad + 2\left( z_2 z_3 k^{[CC]}_{2,3} + z_2 z_4 k^{[CC]}_{2,4} + \cdots + z_2 z_n k^{[CC]}_{2,n} \right)
+ \cdots + 2 z_{n-1} z_n k^{[CC]}_{n-1,n}
\end{aligned} \tag{15}
\]
In Eqn. (15) we have used the fact that k^{[CC]}_{i,j} = k^{[CC]}_{j,i}. Since the principal angle between X_i and itself is zero, k^{[CC]}_{i,i} = 1. Hence Eqn. (15) can be further simplified to:
\[
\begin{aligned}
z^T K z &= \left( \sum_{i=1}^{n} z_i \right)^2
- 2 \sum_{i=1}^{n} \sum_{j \neq i} z_i z_j
+ 2 \sum_{i=1}^{n} \sum_{j \neq i} z_i z_j \, k^{[CC]}_{i,j} \\
&= \left( \sum_{i=1}^{n} z_i \right)^2
+ 2 \sum_{i=1}^{n} \sum_{j \neq i} z_i z_j \left( k^{[CC]}_{i,j} - 1 \right)
\end{aligned} \tag{16}
\]
Note that min( z_i z_j ( k^{[CC]}_{i,j} − 1 ) ) = −z_i z_j, since k^{[CC]}_{i,j} ∈ [0, 1]. Consequently:
\[
\min\left( z^T K z \right)
= \left( \sum_{i=1}^{n} z_i \right)^2
- 2 \sum_{i=1}^{n} \sum_{j \neq i} z_i z_j \tag{17}
\]
As the right-hand side of Eqn. (17) is always positive for z_i ≠ 0, K is a positive-definite matrix.
Input:
- Training set X = {(X_i, l_i)}_{i=1}^{N} from the underlying Grassmannian manifold, where X_i ∈ R^{D×m} is a subspace (obtained for example via SVD over an image-set) and l_i ∈ {1, 2, ..., C}, with C denoting the number of classes
- A kernel function k_ij, for measuring the similarity between two points on the Grassmannian manifold
Processing:
1. Compute the Gram matrix [K]_ij for all X_i, X_j
2. Compute the within-class and between-class graph similarity matrices W_w and W_b, respectively, the between-class Laplacian matrix L_b, and the diagonal within-class matrix D_w
3. To obtain A, solve the maximisation problem in Eqn. (11) by eigen-decomposition; A is equal to the r largest eigenvectors of the Rayleigh quotient (K D_w K^T)^{-1} K (L_b + βW_w) K^T
Output:
The projection matrix A = [A_1 | A_2 | ... | A_r], where each A_i is an eigenvector found in step 3 above; the eigenvectors are sorted in a descending manner according to their corresponding eigenvalues
Figure 2. Pseudocode for training Grassmannian graph-embedding discriminant analysis.
