
HAL Id: hal-01923278
https://hal.archives-ouvertes.fr/hal-01923278
Submitted on 15 Nov 2018
Transfer learning: a Riemannian geometry framework
with applications to Brain-Computer Interfaces
Paolo Zanini, Marco Congedo, Christian Jutten, Salem Said, Yannick
Berthoumieu
To cite this version:
Paolo Zanini, Marco Congedo, Christian Jutten, Salem Said, Yannick Berthoumieu. Transfer learning: a Riemannian geometry framework with applications to Brain-Computer Interfaces. IEEE Transactions on Biomedical Engineering, Institute of Electrical and Electronics Engineers, 2018, 65 (5), pp. 1107-1116. doi:10.1109/TBME.2017.2742541. hal-01923278

Transfer learning: a Riemannian geometry
framework with applications to Brain-Computer
Interfaces
Paolo Zanini, Marco Congedo, Christian Jutten, Salem Said, and Yannick Berthoumieu
Abstract—Objective: This paper tackles the problem of transfer learning in the context of EEG-based Brain-Computer Interface (BCI) classification. In particular, the problems of cross-session and cross-subject classification are considered. These problems concern the ability to use data from previous sessions or from a database of past users to calibrate and initialize the classifier, allowing a calibration-less BCI mode of operation. Methods: Data are represented using spatial covariance matrices of the EEG signals, exploiting the recent successful techniques based on the Riemannian geometry of the manifold of Symmetric Positive Definite (SPD) matrices. Cross-session and cross-subject classification can be difficult, due to the many changes intervening between sessions and between subjects, including physiological, environmental, as well as instrumental changes. Here we propose to affine transform the covariance matrices of every session/subject in order to center them with respect to a reference covariance matrix, making data from different sessions/subjects comparable. Then, classification is performed both using a standard Minimum Distance to Mean (MDM) classifier and through a probabilistic classifier recently developed in the literature, based on a density function (mixture of Riemannian Gaussian distributions) defined on the SPD manifold. Results: The improvements in classification performance achieved by introducing the affine transformation are documented with the analysis of two BCI datasets. Conclusion and significance: Through the proposed affine transformation, we make data from different sessions and subjects comparable, providing a significant improvement in the BCI transfer learning problem.

Index Terms—Brain-Computer Interface, electroencephalography, covariance matrices, Riemannian geometry, mixtures of Gaussians.
P. Zanini is at Gipsa-Lab, Université Grenoble Alpes, France, and IMS, Université de Bordeaux, France (e-mail: paolo.zanini@gipsa-lab.fr).
M. Congedo and C. Jutten are at Gipsa-Lab, Université Grenoble Alpes, France (e-mail: marco.congedo@gipsa-lab.fr; christian.jutten@gipsa-lab.fr).
S. Said and Y. Berthoumieu are at Univ. Bordeaux, Bordeaux INP, CNRS, IMS, UMR 5218, 33400 Talence, France (e-mail: salem.said@ims-bordeaux.fr; yannick.berthoumieu@ims-bordeaux.fr).
Copyright (c) 2016 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org.

I. INTRODUCTION

A Brain-Computer Interface (BCI) is a system capable of predicting or classifying cognitive states and intentions of the user through the analysis of neurophysiological signals [24], [32]. Historically, BCIs have been developed to allow severely paralyzed people to communicate or interact with their environment without relying on the normal muscular or peripheral nerve outputs [8]. More recently, BCIs have also been proposed for healthy people, for instance in driving, forensics, or gaming applications [11], [20], [29]. Several neurophysiological signals can be used for a BCI, acquired either invasively or semi-invasively, e.g., through electrodes implanted into the grey matter or sub-durally. Most BCIs, however, make use of non-invasive neuroimaging modalities, such as near-infrared spectroscopy and, especially, electroencephalography (EEG), which suit both clinical and healthy populations. In this paper we focus on EEG-based BCIs.
The standard classification technique consists of two operational stages [9], [18]. First, EEG signals of a training set are transformed through frequency and/or spatial filters in order to extract discriminant features [8], [16]. A very popular filter in this stage is the Common Spatial Pattern (CSP) [18], [19]. Second, the features enter a machine learning algorithm in order to compute a decision function for performing classification on the test set. This is done by supervised techniques like, for instance, Linear Discriminant Analysis (LDA) [9].

A different approach was presented in [2], where classification is performed using the signal covariance matrices as the features of interest. Covariance matrices do not belong to a Euclidean space; instead, they belong to the smooth Riemannian manifold of Symmetric Positive Definite (SPD) matrices [5]. Hence, in [2], the properties of the SPD manifold are used to perform BCI classification directly on the manifold, as illustrated in subsection II-D. In this paper we consider two separate improvements with respect to the method described in [2]. The first improvement relates to the classification techniques. In [2] the authors used a basic classifier, named Minimum Distance to Mean (MDM), which takes into account distances on the manifold between the observations and some reference points of the classes, known as centers of mass, means, or barycenters. Here we introduce a probabilistic classifier, modeling the class probability distributions and exploiting the Riemannian Gaussian and mixture of Gaussian distributions introduced in [34] and applied to EEG classification in [37].

The second improvement relates to the problem of transfer learning [30]. In the machine learning field, transfer learning is defined as the ability to use previous knowledge as features in a new task/domain related to the previous one. Some examples of transfer learning applied to the BCI problem can be found in [15], [21], [27] and [36]. In this paper we focus specifically on the problem of cross-session and cross-subject BCI learning.
A classical BCI requires a calibration stage at each run, even for a known user. The calibration stage, however short, is inconvenient both for patients, because it wastes part of their limited attention, and for the general public, which is usually unwilling to undergo repeated calibration sessions. As proposed in [12], a BCI should be able to calibrate on-line while it is being used. The problem is then to provide a workable initialization, that is, one that allows the operation of the BCI at the very beginning of the session, even if suboptimal. For a new user, a database of past users can be considered to initialize the classifier. This form of learning is referred to as cross-subject learning. From the second usage on, past data from previous sessions of the user can be employed. This is referred to as cross-session learning. Cross-session learning is known to be a difficult task due to several changes intervening in between the sessions, including physiological, environmental, as well as instrumental changes (e.g., electrode positioning and impedance). Even more difficult is cross-subject learning, because the spatial and temporal configuration of brain dipolar sources is subject to substantial individual variability. In the Riemannian framework the cross-session and cross-subject changes can be understood as geometric transformations of the covariance matrices. In this work we will refer to this geometric transformation as a “shift”, although we should keep in mind that a transformation may entail more than a simple displacement on the manifold.
A first attempt to solve the shift problem is described in [33]; however, this work does not consider the structure of the covariance matrix manifold. In [3], instead, the authors introduce a way to solve the shift problem in a Riemannian framework for the cross-session situation; however, this approach depends on the order of the tasks performed during an experiment and on the (unknown) structure of the classes in the classification problem. In this paper we develop an idea similar to the one presented in [33], but in a Riemannian framework. Our approach does not depend on the (unknown) label sequence of the observations obtained during the experiment. We assume that different source configurations and electrode positions induce shifts of covariance matrices with respect to a reference (resting) state, but that when the brain is engaged in a specific task, covariance matrices move over the SPD manifold in the same direction. This assumption allows a workable model and a simple solution thanks to the congruence invariance property of SPD matrices (which we will describe in subsection II-A). We will center the covariance matrices of every session/subject with respect to a reference covariance matrix, so that what we observe is only the displacement with respect to the reference state due to the task. We estimate a reference matrix for every session, different between sessions and between subjects. Then, we perform a congruent transformation of our data using this reference matrix. In this way, observations belonging to the same session and subject do not change their relative distances and geometric structure. However, since the reference matrix varies among sessions and among subjects, these data are moved in the manifold in different directions and, if the reference matrix is chosen accurately, data from different sessions/subjects become comparable. As we will show with the analysis of two BCI data sets, this procedure provides an efficient initialization for cross-session and cross-subject classification problems.
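As an illustration, a minimal sketch of this centering step, assuming the congruent map takes the usual form $P \mapsto R^{-1/2} P R^{-1/2}$, which sends the reference matrix $R$ to the identity (the estimation of $R$ is part of the method described in Section IV); the function and variable names below are ours:

```python
import numpy as np
from scipy.linalg import fractional_matrix_power

def center_covariances(covs, reference):
    """Map each SPD matrix P to R^{-1/2} P R^{-1/2}, with R the
    session/subject reference matrix. R itself is sent to the identity,
    and relative distances within a session are preserved by the
    congruence invariance property (see subsection II-A)."""
    r = fractional_matrix_power(reference, -0.5)  # symmetric, so r.T == r
    return np.array([r @ P @ r for P in covs])
```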
In the EEG-based BCI literature, different kinds of tasks can be used to design a BCI (see [12] for an exhaustive description). In this work we analyze two different paradigms in order to widen the scope of our analysis. The first one relates to a Motor Imagery (MI) paradigm and the second one to an Event-Related Potential (ERP) paradigm. For the first dataset we analyze nine subjects, each one performing two sessions, and we evaluate the accuracy for cross-session and cross-subject classification. We obtain significant improvements by using the proposed procedure, especially for cross-subject classification, where we can increase the performance by 30% in some cases. For the second dataset we analyze 17 subjects and we evaluate the precision for cross-subject classification. Also in this case we obtain substantial improvements by introducing our procedure. Furthermore, for both datasets, we discuss the situations where the introduction of a probabilistic classifier can result in further improvements.
The paper is organized as follows. In Section II, basic concepts of Riemannian geometry are introduced. In Section III, the two BCI paradigms are described in detail, focusing in particular on how to build, in each case, the SPD matrices to be used in a Riemannian framework. Then, in Section IV, we describe the proposed Riemannian transfer learning methods. In Section V we present the results obtained with the two datasets analyzed. Finally, we conclude our work in Section VI.
II. ELEMENTS OF RIEMANNIAN GEOMETRY

In this section we present some basic properties of the space of SPD matrices, introduce a probability distribution on this space, and define classification rules for SPD matrices.
A. Manifold of SPD matrices: basic concepts

We start by introducing $M(n)$ and $S(n)$ as the vector space of $n \times n$ square matrices and the vector space in $M(n)$ of symmetric $n \times n$ square matrices, respectively. Specifically, $M(n) = \{M \in \mathbb{R}^{n \times n}\}$, while $S(n) = \{S \in M(n) : S = S^T\}$. The set of SPD matrices $P(n) = \{P \in S(n) : u^T P u > 0 \;\forall u \in \mathbb{R}^n, u \neq 0\}$ is an open subset of $S(n)$; in particular, it is an open convex cone of dimension $n(n+1)/2$. $P(n)$ is the space of covariance matrices and it is our space of interest. When endowed with the Fisher-Rao metric [5], $P(n)$ turns out to be a smooth Riemannian manifold with non-positive curvature. This means that for every point $P \in P(n)$, in the tangent space $T_P$ (which in this case can be identified with $S(n)$), we define a scalar product which varies smoothly with $P$. The local inner product and, as a consequence, the local norm are defined as

$$\langle U, V \rangle_P = \mathrm{tr}(P^{-1} U P^{-1} V), \qquad \|U\|_P^2 = \langle U, U \rangle_P, \qquad (1)$$

respectively, where $U, V \in S(n)$. Through the natural metric (1), a distance between two points $P_1, P_2 \in P(n)$ can be defined as the length of the unique shortest curve (called geodesic) connecting $P_1$ and $P_2$ [5]:

$$\delta(P_1, P_2) = \big\|\log\big(P_1^{-1/2} P_2 P_1^{-1/2}\big)\big\|_F = \left(\sum_{i=1}^{n} \log^2 \lambda_i\right)^{1/2}, \qquad (2)$$

with $\|\cdot\|_F$ the Frobenius norm and $\lambda_1, \ldots, \lambda_n$ the eigenvalues of $P_1^{-1/2} P_2 P_1^{-1/2}$ (or, equivalently, of $P_1^{-1} P_2$; the indices 1 and 2 can be permuted since $\delta(\cdot,\cdot)$ is symmetric).
The Riemannian distance $\delta(\cdot,\cdot)$ has two important invariances:

i. $\delta(P_1^{-1}, P_2^{-1}) = \delta(P_1, P_2)$;
ii. $\delta(C^T P_1 C, C^T P_2 C) = \delta(P_1, P_2) \;\forall C \in GL(n)$,

with $GL(n) = \{C \in M(n) : C \text{ invertible}\}$ the set of invertible matrices. Property ii, called congruence invariance, means that the distance between two SPD matrices is invariant with respect to a change of reference, i.e., to any linear invertible transformation in the data (recordings) space. This property will be particularly important in the following.
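As an illustration, the distance (2) can be obtained from the generalized eigenvalues of the pair $(P_2, P_1)$; a minimal NumPy/SciPy sketch (names ours):

```python
import numpy as np
from scipy.linalg import eigvalsh

def riemannian_distance(P1, P2):
    """Affine-invariant Riemannian distance (2) between SPD matrices.

    The eigenvalues of P1^{-1} P2 are computed as the generalized
    eigenvalues of the pencil (P2, P1), avoiding an explicit inverse.
    """
    lam = eigvalsh(P2, P1)  # solves P2 u = lam P1 u
    return np.sqrt(np.sum(np.log(lam) ** 2))

# Congruence invariance (property ii): for any invertible C,
# riemannian_distance(C.T @ P1 @ C, C.T @ P2 @ C) equals
# riemannian_distance(P1, P2) up to numerical precision.
```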
B. Center of mass of a set of SPD matrices

The simplest statistical descriptor of a set of objects is the mean value, which is meant to provide a suitable representative of the set. The most familiar mean is the arithmetic mean. It has an important variational characterization: given a set $P_1, \ldots, P_N$ of SPD matrices, the arithmetic mean $A(P_1, \ldots, P_N)$ is the point $P$ which minimizes the sum of squared Euclidean distances $d_e(\cdot,\cdot)$:

$$A(P_1, \ldots, P_N) = \arg\min_{P \in P(n)} \sum_{i=1}^{N} d_e^2(P_i, P). \qquad (3)$$

Similarly, it has been shown that the Riemannian distance can be used to define a geometric mean, or center of mass, of a set of SPD matrices through a variational approach [6]. The center of mass $G(P_1, \ldots, P_N)$ is defined as the point of the manifold satisfying

$$G(P_1, \ldots, P_N) = \arg\min_{P \in P(n)} \sum_{i=1}^{N} \delta^2(P_i, P), \qquad (4)$$

with $\delta(\cdot,\cdot)$ defined in (2). In the literature, (4) is often called the Cartan/Fréchet/Karcher mean [5], [6], [22]. Since $P(n)$ is a Riemannian manifold of non-positive curvature, existence and uniqueness of the Riemannian mean can be proved [1], [28]. However, an explicit solution exists only for $N = 2$, where it coincides with the midpoint of the geodesic connecting the two SPD matrices of the set. For $N > 2$ a solution can be found iteratively, and several algorithms following different approaches have been developed in the literature [22]. Some of them seek the exact value through numerical procedures such as deterministic line search [17], [26] or simple and stochastic gradient descent [10], [31]. Other, faster and computationally lighter approaches look for a suitable approximation of the center of mass; see for instance [6], [13], [14].
An important invariance property of the center of mass is

$$G(C^T P_1 C, \ldots, C^T P_N C) = C^T G(P_1, \ldots, P_N) C \quad \forall C \in GL(n),$$

inherited from the congruence invariance of the Riemannian distance mentioned above. This result means that the center of mass is shifted through the same affine transformation as the matrices of the set.
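For concreteness, a minimal NumPy sketch (names ours) of the classical fixed-point iteration for (4), one instance of the gradient-descent approaches cited above: the points are projected to the tangent space at the current estimate, averaged there, and mapped back until the mean tangent vector vanishes.

```python
import numpy as np

def _powm(S, p):
    """S^p for a symmetric positive definite matrix S, via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w ** p) @ V.T

def _logm(S):
    w, V = np.linalg.eigh(S)
    return (V * np.log(w)) @ V.T

def _expm(S):
    w, V = np.linalg.eigh(S)
    return (V * np.exp(w)) @ V.T

def karcher_mean(covs, tol=1e-8, max_iter=50):
    """Center of mass (4) of a set of SPD matrices (fixed-point iteration)."""
    G = np.mean(covs, axis=0)  # arithmetic mean (3) as initialization
    for _ in range(max_iter):
        G_s, G_is = _powm(G, 0.5), _powm(G, -0.5)
        # mean of the Riemannian logarithms of the points at G
        T = np.mean([_logm(G_is @ P @ G_is) for P in covs], axis=0)
        G = G_s @ _expm(T) @ G_s  # exponential map back to the manifold
        if np.linalg.norm(T) < tol:
            break
    return G
```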
C. Mixtures of Gaussian distributions on the manifold of SPD matrices

Distance and center of mass are geometric concepts concerning the properties of the manifold of SPD matrices, but they do not involve any probabilistic assumptions on a sample of SPD matrices. To consider a probabilistic model we introduce a class of probability distributions on the space $P(n)$, called Riemannian Gaussian distributions and defined in [34]. Such a distribution is denoted $G(\bar{P}, \sigma)$ and depends on two parameters, $\bar{P} \in P(n)$ and $\sigma > 0$. It is defined by its probability density function

$$f(P \mid \bar{P}, \sigma) = \frac{1}{\zeta(\sigma)} \exp\left(-\frac{\delta^2(P, \bar{P})}{2\sigma^2}\right), \qquad (5)$$

where $\zeta(\sigma)$ is a normalization function. In [34] it has been shown that, given $P_1, \ldots, P_N$ i.i.d. from (5), the Maximum Likelihood Estimator (MLE) of $\bar{P}$ coincides with the center of mass (4). For the MLE of $\sigma$, an efficient procedure is presented in [37]. If we consider only the Gaussian distribution, we are not able to describe a wide range of real problems. In the classical Euclidean framework, mixtures of Gaussians are considered in order to accommodate several distribution shapes [34]. In the Riemannian framework this is also possible in a straightforward way. A mixture of Riemannian Gaussian distributions is a distribution on $P(n)$ whose density function can be written as

$$f(P) = \sum_{m=1}^{M} w_m f(P \mid \bar{P}_m, \sigma_m), \qquad (6)$$

with $w_1, \ldots, w_M$ non-negative weights summing to 1. The parameters of (6) can be estimated, for instance, through an Expectation-Maximization (EM) algorithm, as described in [34]. This class of distributions will be used to build a probabilistic classifier for data in $P(n)$, as described in the next subsection.
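For illustration, evaluating (6) could look as follows (a sketch, not the paper's code); `riemannian_distance` is the sketch from subsection II-A, and `zeta` is assumed to be a user-supplied numerical evaluation of $\zeta(\sigma)$, which has no simple closed form on $P(n)$ (see [34], [37]):

```python
import numpy as np

def mixture_density(P, weights, centers, sigmas, zeta):
    """Mixture of Riemannian Gaussians, eq. (6), evaluated at the SPD
    matrix P. `zeta(sigma)` must numerically evaluate the normalization
    constant of eq. (5)."""
    return sum(
        w * np.exp(-riemannian_distance(P, Pm) ** 2 / (2.0 * s ** 2)) / zeta(s)
        for w, Pm, s in zip(weights, centers, sigmas)
    )
```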
D. Classification techniques in the manifold of SPD matrices

In [2] the authors proposed a classification procedure based on the Minimum Distance to Mean (MDM) classifier, which is defined as follows: given $K$ classes and a training phase where the centers of mass $\hat{C}(k)$ of the classes ($k = 1, \ldots, K$) are estimated, a new observation $C_i$ is assigned to class $\hat{k}$ according to the classification rule

$$\hat{k} = \arg\min_{k \in \{1,\ldots,K\}} \{d_R(C_i, \hat{C}(k))\}, \qquad (7)$$

with $d_R(\cdot,\cdot)$ the Riemannian distance (2). This rule takes into consideration the Riemannian distance of the new observation to the centers of mass, ignoring information on the dispersion of the groups, encoded by the parameter $\sigma$ in the Riemannian Gaussian distribution (5). The principle of Bayesian classification can be used by exploiting such a distribution. In this case, the classification rule based on the a posteriori distribution reads

$$\hat{k} = \arg\min_{k \in \{1,\ldots,K\}} \left\{\log \zeta(\hat{\sigma}(k)) + \frac{d_R^2(C_i, \hat{C}(k))}{2\hat{\sigma}^2(k)}\right\}, \qquad (8)$$

where $\hat{\sigma}(k)$ is the MLE of the dispersion parameter of the $k$-th class [37]. Of course, if the $\hat{\sigma}(k)$ coincide for all classes, (8) reduces to (7). In order not to be limited to a simple class of distributions, we can consider mixtures of Gaussians (6), updating the Bayesian classification rule accordingly. In this paper we consider a number of mixture components $M$ varying from 2 to 4.
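A compact sketch (ours) of rules (7) and (8), reusing `riemannian_distance` and `karcher_mean` from the sketches above; as before, `log_zeta` is assumed to be supplied as a numerical routine for $\log \zeta(\sigma)$:

```python
import numpy as np

def fit_mdm(covs, labels):
    """Training phase: one center of mass per class.
    covs: N x n x n array; labels: length-N array of class labels."""
    return {k: karcher_mean(covs[labels == k]) for k in np.unique(labels)}

def predict_mdm(centers, C):
    """MDM rule (7): assign C to the nearest class center."""
    return min(centers, key=lambda k: riemannian_distance(C, centers[k]))

def predict_bayes(centers, sigmas, log_zeta, C):
    """Bayesian rule (8), with per-class dispersions sigmas[k]."""
    return min(
        centers,
        key=lambda k: log_zeta(sigmas[k])
        + riemannian_distance(C, centers[k]) ** 2 / (2.0 * sigmas[k] ** 2),
    )
```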
III. DATA

We analyze two different EEG-based BCI datasets, related to the MI and ERP frameworks. The way to build SPD matrices differs between the two cases and is described in subsections III-A and III-B, respectively. Then, in subsection III-C, we show how cross-session and cross-subject classification can be problematic, exploiting a visualization technique for high-dimensional data named t-distributed Stochastic Neighbor Embedding (t-SNE) [35].
A. Motor Imagery: data construction

The analyzed dataset is the one from the BCI competition [25], already analyzed in [2], [18]. It contains EEG data from nine subjects performing four kinds of motor imagery (right hand, left hand, foot, and tongue imagined movements). A total of 576 trials per subject are available, each trial corresponding to a movement (balanced experiment, i.e., 144 trials per class). Half of the trials (288) are obtained during the first session, and the other half during a second session. For each trial $l$ we register the centered EEG signal $X_l \in \mathbb{R}^{n \times T}$, where $n$ is the number of electrodes and $T$ the number of sample points of the time window considered to evaluate the sample covariance, in this case from 0.5 to 2.5 seconds after the stimulus. Then we use for the analysis the empirical covariance matrix defined as

$$C_{X_l} = \frac{1}{T-1} X_l X_l^T.$$

In this experiment signals are recorded using 22 electrodes ($n = 22$), hence covariance matrices here belong to $P(22)$. As usual with motor imagery data, before computing covariance matrices, EEG signals are band-pass filtered by a 5th-order Butterworth filter in the 8–30 Hz frequency band.
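A sketch (ours) of this per-trial pipeline; the 250 Hz sampling rate is an assumption based on the standard description of this competition dataset, not stated in the text, and the zero-phase (forward-backward) filtering is an implementation choice:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def mi_trial_covariance(X, fs=250.0, band=(8.0, 30.0)):
    """Sample covariance (a point of P(22)) of one MI trial X (n x T).

    The trial is band-pass filtered with a 5th-order Butterworth filter
    (applied forward-backward, i.e. zero-phase) and channel-wise
    centered before computing C = X X^T / (T - 1).
    """
    sos = butter(5, band, btype="bandpass", fs=fs, output="sos")
    Xf = sosfiltfilt(sos, X, axis=-1)
    Xf -= Xf.mean(axis=-1, keepdims=True)
    return (Xf @ Xf.T) / (Xf.shape[-1] - 1)
```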
B. ERP: data construction

This dataset comes from a Brain Invaders experiment carried out at GIPSA-lab in Grenoble, France [11]. Subjects watch a screen with 36 aliens flashing alternately. They are requested to mentally count the number of flashes of a specific (known) target alien. Whenever the target alien flashes, this experiment generates in the EEG signals an Event-Related Potential (ERP) named P300 [11]. The main goal is to detect the target trials from the EEG signals. Thus, we have two classes in this experiment: P300 signals (target class) and normal signals (non-target class). In this framework we cannot simply consider the covariance matrices $C_{X_l}$. Indeed, if we randomly shuffle the time instants of a specific trial, the estimate of its covariance matrix does not change, and neither does the classification result. Since temporal information is essential to detect an ERP, we augment the trial by integrating a component related to the temporal profile of the ERP event considered, following the procedure described in [4] and [23]. Specifically, we consider the average ERP response

$$E = \frac{1}{|K_+|} \sum_{l \in K_+} X_l \in \mathbb{R}^{n \times T},$$

where $K_+$ is the group of target trials (ERP in this case). Then we build an augmented trial signal matrix $\tilde{X}_l$, defined as

$$\tilde{X}_l = \begin{bmatrix} E \\ X_l \end{bmatrix} \in \mathbb{R}^{2n \times T},$$

and then we consider the augmented covariance matrix $\tilde{C}_{\tilde{X}_l}$ of dimension $2n \times 2n$:

$$\tilde{C}_{\tilde{X}_l} = \begin{bmatrix} C_E & C_{EX_l} \\ C_{X_l E} & C_{X_l} \end{bmatrix}.$$

Relevant information for distinguishing a target from a non-target trial is embedded in the block $C_{EX_l}$ (and in its transpose $C_{X_l E}$). In these blocks, entries will be far from zero only for target trials, since only the time series of target trials are correlated with the average ERP $E$. Thus, on the SPD manifold, augmented covariance matrices for target trials will be far from the augmented covariance matrices for non-target trials. Notice that if we randomly shuffle the time instants of a specific trial, the augmented covariance matrix does change, which means that we have effectively embedded the temporal information into these matrices. A training phase is needed to build the average ERP response. In this experiment we consider 17 subjects, with a number of trials varying from one subject to another, ranging from 500 to 750. EEG signals are recorded at a frequency of 512 Hz using 13 electrodes (i.e., $n = 13$), hence covariance matrices here belong to $P(26)$. Every trial is recorded for a period of one second after the stimulus (the flash). Thus, augmented covariance matrices are estimated using 512 observations.
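A sketch (ours) of the augmented covariance construction, assuming the average target response $E$ has already been estimated on training data:

```python
import numpy as np

def erp_augmented_covariance(X, E):
    """Augmented covariance (a point of P(26) when n = 13) of one ERP trial.

    X and E are n x T arrays (single trial and average target response).
    The cross-blocks C_EX / C_XE of the resulting 2n x 2n covariance carry
    the correlation between the trial and the P300 template."""
    X_aug = np.vstack([E, X])                           # 2n x T
    X_aug = X_aug - X_aug.mean(axis=-1, keepdims=True)  # center each row
    return (X_aug @ X_aug.T) / (X_aug.shape[-1] - 1)
```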
C. Data visualization using t-SNE

The visualization technique called t-SNE [35] represents high-dimensional data by mapping each point to a location in a 2- or 3-dimensional space, while optimizing the pairwise distances in the reduced space with respect to those in the original manifold. In our case we aim to represent each covariance matrix as a point in a 2-dimensional space in order to appreciate the effect of the cross-session and cross-subject shift.
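One way to produce such a map (a sketch under our own assumptions, not the paper's code) is to feed t-SNE the matrix of pairwise Riemannian distances (2) as a precomputed metric, reusing `riemannian_distance` from subsection II-A:

```python
import numpy as np
from sklearn.manifold import TSNE

def tsne_embed(covs, random_state=0):
    """2-D embedding of a set of SPD matrices: pairwise Riemannian
    distances are precomputed and handed to t-SNE as the metric."""
    N = len(covs)
    D = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1, N):
            D[i, j] = D[j, i] = riemannian_distance(covs[i], covs[j])
    # 'precomputed' metric requires a random (not PCA) initialization
    return TSNE(n_components=2, metric="precomputed",
                init="random", random_state=random_state).fit_transform(D)
```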
In Figures 1 and 5 the data from the MI experiment are shown. In each plot of Figure 1, data for the two sessions are depicted (circles for session 1 and crosses for session 2), with colors identifying the classes. In Figure 5 a more detailed representation of subject 9 is depicted, with plots divided by class. We can observe that data relative to session 2 are shifted with respect to session 1, for every subject. This means that, in
