

Regularizing Common Spatial Patterns to Improve BCI Designs: Unified Theory and New Algorithms

Fabien Lotte, Member, IEEE, Cuntai Guan, Senior Member, IEEE
Abstract—One of the most popular feature extraction algorithms for Brain-Computer Interfaces (BCI) is Common Spatial Patterns (CSP). Despite its known efficiency and widespread use, CSP is also known to be very sensitive to noise and prone to overfitting. To address this issue, it has recently been proposed to regularize CSP. In this paper, we present a simple and unifying theoretical framework to design such a Regularized CSP (RCSP). We then present a review of existing RCSP algorithms, and describe how to cast them in this framework. We also propose 4 new RCSP algorithms. Finally, we compare the performances of 11 different RCSP algorithms (including the 4 new ones and the original CSP) on EEG data from 17 subjects, from BCI competition data sets. Results showed that the best RCSP methods can outperform CSP by nearly 10% in median classification accuracy and lead to more neurophysiologically relevant spatial filters. They also enable us to perform efficient subject-to-subject transfer. Overall, the best RCSP algorithms were CSP with Tikhonov Regularization and Weighted Tikhonov Regularization, both proposed in this paper.
Index Terms—brain-computer interfaces (BCI), EEG, common spatial patterns (CSP), regularization, subject-to-subject transfer
I. INTRODUCTION
Brain-Computer Interfaces (BCI) are communication systems which enable users to send commands to computers by using brain activity only, this activity being generally measured by ElectroEncephaloGraphy (EEG) [1]. BCI are generally designed according to a pattern recognition approach, i.e., by extracting features from EEG signals, and by using a classifier to identify the user's mental state from such features [1][2]. The Common Spatial Patterns (CSP) algorithm is a feature extraction method which can learn spatial filters maximizing the discriminability of two classes [3][4]. CSP has proven to be one of the most popular and efficient algorithms for BCI design, notably during BCI competitions [5][6].

Despite its popularity and efficiency, CSP is also known to be highly sensitive to noise and to severely overfit with small training sets [7][8]. To address these drawbacks, a recent idea has been to add prior information to the CSP learning process, in the form of regularization terms [9][10][11][12] (see Section IV-A for a review). These Regularized CSP (RCSP) algorithms have all been shown to outperform classical CSP. However, they are all expressed with different formulations and therefore lack a unifying regularization framework. Moreover, they were only compared to standard CSP, and typically with 4 or 5 subjects only [9][10][11], which makes it difficult to assess their relative performances. Finally, we believe that a variety of other priors could be incorporated into CSP.
F. Lotte and C. Guan are with the Institute for Infocomm Research, 1 Fusionopolis Way, 138632, Singapore. E-mail: {fprlotte,ctguan}@i2r.a-star.edu.sg
Therefore, in this paper we present a simple theoretical framework that could unify RCSP algorithms. We present existing RCSP algorithms within this unified framework, as well as 4 new RCSP algorithms based on new priors. It should be mentioned that preliminary studies of 2 of these new algorithms have been presented in conference papers [12][13]. Finally, we compare these various algorithms on EEG data from 17 subjects, from publicly available BCI competition data sets.

This paper is organized as follows: Section II describes the CSP algorithm, while Section III presents the theoretical framework to regularize it. Section IV expresses existing RCSP algorithms within this framework and presents 4 new ones. Finally, Sections V and VI describe the evaluations performed and their results, and conclude the paper, respectively.
II. THE CSP ALGORITHM
CSP aims at learning spatial filters which maximize the variance of band-pass filtered EEG signals from one class while minimizing the variance of signals from the other class [4][3]. As the variance of EEG signals filtered in a given frequency band corresponds to the signal power in this band, CSP aims at achieving optimal discrimination for BCI based on band power features [3]. Formally, CSP uses the spatial filters w which extremize the following function:

J(w) = \frac{w^T X_1^T X_1 w}{w^T X_2^T X_2 w} = \frac{w^T C_1 w}{w^T C_2 w}    (1)
where T denotes transpose, X_i is the data matrix for class i (with the training samples as rows and the channels as columns) and C_i is the spatial covariance matrix of class i, assuming a zero mean for EEG signals. This last assumption is generally met when EEG signals are band-pass filtered. This optimization problem can be solved (though this is not the only way) by first observing that the function J(w) remains unchanged if the filter w is rescaled. Indeed, J(kw) = J(w) for any real constant k, which means the rescaling of w is arbitrary. As such, extremizing J(w) is equivalent to extremizing w^T C_1 w subject to the constraint w^T C_2 w = 1, as it is always possible to find a rescaling of w such that w^T C_2 w = 1. Using the Lagrange multiplier method, this constrained optimization problem amounts to extremizing the following function:

L(\lambda, w) = w^T C_1 w - \lambda (w^T C_2 w - 1)    (2)
The filters w extremizing L are such that the derivative of L with respect to w equals 0:

\frac{\partial L}{\partial w} = 2 w^T C_1 - 2 \lambda w^T C_2 = 0
\Leftrightarrow \quad C_1 w = \lambda C_2 w
\Leftrightarrow \quad C_2^{-1} C_1 w = \lambda w

We obtain a standard eigenvalue problem. The spatial filters extremizing Eq. 1 are then the eigenvectors of M = C_2^{-1} C_1 which correspond to its largest and lowest eigenvalues. When using CSP, the extracted features are the logarithm of the EEG signal variance after projection onto the filters w.
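For illustration, a minimal Python sketch of this procedure could look as follows. The function names, the trial layout and the use of scipy.linalg.eigh to solve the generalized eigenvalue problem C_1 w = \lambda C_2 w (equivalent to diagonalizing C_2^{-1} C_1) are our own illustrative assumptions, not the authors' implementation:

import numpy as np
from scipy.linalg import eigh

def csp_filters(trials_1, trials_2, n_filters=3):
    # Each trial is a (n_samples, n_channels) array of band-pass filtered EEG.
    # Spatial covariance per class (zero mean assumed, as signals are band-pass filtered)
    C1 = np.mean([x.T @ x / x.shape[0] for x in trials_1], axis=0)
    C2 = np.mean([x.T @ x / x.shape[0] for x in trials_2], axis=0)
    # Generalized eigenvalue problem C1 w = lambda C2 w; eigenvalues in ascending order
    _, W = eigh(C1, C2)
    # Keep the eigenvectors with the largest and lowest eigenvalues (extremizers of Eq. 1)
    return np.hstack([W[:, -n_filters:], W[:, :n_filters]])

def csp_features(trial, W):
    # Log-variance (i.e., log band power) features after spatial filtering
    return np.log(np.var(trial @ W, axis=0))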
III. REGULARIZED CSP: THEORY
As mentioned above, to overcome the sensitivity of CSP to noise and overfitting, one should regularize it. Adding prior information to CSP, and thus regularizing it, can be done at two levels. First, it can be done at the covariance matrix estimation level: since CSP relies on covariance matrix estimates, these estimates can suffer from noise or small training sets, and can thus benefit from regularization. Another approach consists in regularizing CSP at the level of the objective function (Eq. 1), by imposing priors on the spatial filters to be obtained. The remainder of this section presents these two approaches.
A. Regularizing the covariance matrix estimates
CSP requires estimating the spatial covariance matrix for each class. However, if the EEG training set is noisy and/or small, these covariance matrices may be poor or non-representative estimates of the mental states involved, and thus lead to poor spatial filters. Therefore, it is appropriate to add prior information to these estimates by using regularization terms. Based on [10], this can be performed as follows:

\tilde{C}_c = (1 - \gamma) \hat{C}_c + \gamma I    (3)

with

\hat{C}_c = (1 - \beta) s_c C_c + \beta G_c    (4)
where C_c is the initial spatial covariance matrix for class c, \tilde{C}_c is the regularized estimate, I is the identity matrix, s_c is a constant scaling parameter (a scalar), γ and β are two user-defined regularization parameters (γ, β ∈ [0, 1]) and G_c is a "generic" covariance matrix (see below). Two regularization terms are involved here. The first one, associated with γ, shrinks the initial covariance matrix estimate towards the identity matrix, to counteract a possible estimation bias due to a small training set. The second term, associated with β, shrinks the initial covariance matrix estimate towards a generic covariance matrix, to obtain a more stable estimate. This generic matrix represents a given prior on what the covariance matrix for the mental state considered should look like. This matrix is typically built by using signals from several subjects whose EEG data have been recorded previously. This has been shown to be an effective way to perform subject-to-subject transfer [11][10][13]. However, it should be mentioned that G_c could also be defined based on neurophysiological priors only.

Learning spatial filters with this method simply consists in replacing the covariance matrices C_1 and C_2 used in CSP by their regularized estimates \tilde{C}_1 and \tilde{C}_2. Many different RCSP algorithms can thus be designed, depending on whether one or both regularization terms are used and, more importantly, on how the generic covariance matrix G_c is built.
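As a sketch, Eqs. (3) and (4) translate directly into a few lines of NumPy; the function name and argument layout below are our illustrative assumptions:

import numpy as np

def regularized_cov(C_c, G_c, beta, gamma, s_c=1.0):
    # Eq. (4): shrink the initial estimate towards the generic matrix G_c
    C_hat = (1.0 - beta) * s_c * C_c + beta * G_c
    # Eq. (3): shrink the result towards the identity matrix
    return (1.0 - gamma) * C_hat + gamma * np.eye(C_hat.shape[0])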
B. Regularizing the CSP objective function
Another approach to obtaining regularized CSP algorithms consists in regularizing the CSP objective function itself (Eq. 1). More precisely, such a method consists in adding a regularization term to the CSP objective function in order to penalize solutions (i.e., resulting spatial filters) that do not satisfy a given prior. Formally, the objective function becomes:

J_{P_1}(w) = \frac{w^T C_1 w}{w^T C_2 w + \alpha P(w)}    (5)
where P(w) is a penalty function measuring how much the spatial filter w satisfies a given prior. The more w satisfies it, the lower P(w). Hence, to maximize J_{P_1}(w), we must minimize P(w), thus ensuring spatial filters that satisfy the prior. α is a user-defined regularization parameter (α ≥ 0; the higher α, the more the prior is enforced). With this regularization, we expect that enforcing specific solutions, thanks to priors, will guide the optimization process towards good spatial filters, especially with limited or noisy training data.

In this paper, we focus on quadratic penalties: P(w) = \|w\|_K^2 = w^T K w, where the matrix K encodes the prior. Interestingly enough, RCSP with non-quadratic penalties have been proposed [14][15]. They used an l_1-norm penalty to select a sparse set of channels. However, these studies showed that sparse CSP generally gave lower performances than CSP (with all channels), although it requires far fewer channels, hence performing efficient channel reduction. As the focus of this paper is not channel reduction but performance enhancement, we only consider quadratic penalties here. Moreover, quadratic penalties lead to a closed-form solution for the optimization (see below), which is more convenient and computationally efficient. With a quadratic penalty term, Eq. 5 becomes:
J_{P_1}(w) = \frac{w^T C_1 w}{w^T C_2 w + \alpha w^T K w} = \frac{w^T C_1 w}{w^T (C_2 + \alpha K) w}
The corresponding Lagrangian is:

L_{P_1}(\lambda, w) = w^T C_1 w - \lambda (w^T (C_2 + \alpha K) w - 1)    (6)

By following the same approach as previously (see Section II), we obtain the following eigenvalue problem:

(C_2 + \alpha K)^{-1} C_1 w = \lambda w    (7)
Thus, the filters w maximizing J_{P_1}(w) are the eigenvectors corresponding to the largest eigenvalues of M_1 = (C_2 + \alpha K)^{-1} C_1. With CSP, the eigenvectors corresponding to both the largest and smallest eigenvalues of M (see Section II) are used as the spatial filters, as they respectively maximize and minimize Eq. 1 [4]. However, for RCSP, the eigenvectors corresponding to the lowest eigenvalues of M_1 minimize Eq. 5, and as such maximize the penalty term (which should be minimized). Therefore, in order to obtain the filters which maximize C_2 while minimizing C_1, we also need to maximize the following objective function:

J_{P_2}(w) = \frac{w^T C_2 w}{w^T C_1 w + \alpha P(w)}    (8)

which is achieved by using the eigenvectors corresponding to the largest eigenvalues of M_2 = (C_1 + \alpha K)^{-1} C_2 as the filters w. In other words, with RCSP, the spatial filters used are the eigenvectors corresponding to the largest eigenvalues of M_1 and to the largest eigenvalues of M_2. With this approach, various regularized CSP algorithms can be designed depending on the knowledge encoded into the matrix K.
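A hedged sketch of this solution: since C_2 + \alpha K is symmetric (and positive definite for the penalties considered here), the eigenvectors of M_1 and M_2 can be obtained from two symmetric generalized eigenvalue problems. The function below is our own illustration, not the authors' code:

import numpy as np
from scipy.linalg import eigh

def rcsp_filters(C1, C2, K, alpha, n_filters=3):
    # Eigenvectors of M1 = (C2 + alpha K)^{-1} C1,
    # via the equivalent problem C1 w = lambda (C2 + alpha K) w
    _, V1 = eigh(C1, C2 + alpha * K)
    # Eigenvectors of M2 = (C1 + alpha K)^{-1} C2
    _, V2 = eigh(C2, C1 + alpha * K)
    # eigh sorts eigenvalues in ascending order: keep the last n_filters columns
    return np.hstack([V1[:, -n_filters:], V2[:, -n_filters:]])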
C. Summary
We have presented two theoretical approaches to design RCSP algorithms: one at the covariance matrix estimation level and one at the objective function level. Naturally, these two approaches are not exclusive and can be combined within the same framework. Table I summarizes this framework and highlights the differences between CSP and RCSP. Within this framework, many different RCSP algorithms can be designed, depending on 1) which of the 3 regularization terms (associated with α, β and γ) is (are) used and 2) how the matrices G_c and K are built. The following section presents several such variants, including existing algorithms as well as 4 new ones.
TABLE I
DIFFERENCES IN OBJECTIVE FUNCTION AND ALGORITHM OPTIMIZATION BETWEEN A STANDARD CSP AND A REGULARIZED CSP (RCSP).

Objective function:
    CSP:   J(w) = \frac{w^T C_1 w}{w^T C_2 w}
    RCSP:  J_{P_{1,2}}(w) = \frac{w^T \tilde{C}_{1,2} w}{w^T \tilde{C}_{2,1} w + \alpha P(w)}
           with P(w) = w^T K w, \tilde{C}_c = (1 - \gamma) \hat{C}_c + \gamma I
           and \hat{C}_c = (1 - \beta) s_c C_c + \beta G_c

Solutions of the optimization problem:
    CSP:   eigenvectors corresponding to the N_f largest and N_f lowest
           eigenvalues of M = C_2^{-1} C_1
    RCSP:  eigenvectors corresponding to the N_f largest eigenvalues of
           M_1 = (\tilde{C}_2 + \alpha K)^{-1} \tilde{C}_1 and of
           M_2 = (\tilde{C}_1 + \alpha K)^{-1} \tilde{C}_2
IV. REGULARIZED CSP: ALGORITHMS
A. Existing RCSP algorithms
Four RCSP algorithms have been proposed so far: Composite CSP, Regularized CSP with Generic Learning, Regularized CSP with Diagonal Loading and Invariant CSP. They are described below within the presented framework.
1) Composite CSP:
The Composite CSP (CCSP) algorithm, proposed by Kang et al. [11], aims at performing subject-to-subject transfer by regularizing the covariance matrices using other subjects' data. Expressed within the framework of this paper, CCSP uses only the β hyperparameter (α = γ = 0), and defines the generic covariance matrices G_c according to covariance matrices of other subjects. Two methods were proposed to build G_c.

With the first method, denoted here as CCSP1, G_c is built as a weighted sum of the covariance matrices (corresponding to the same mental state) of other subjects, by de-emphasizing covariance matrices estimated from fewer trials:

G_c = \sum_{i \in \Omega} \frac{N_c^i}{N_{t,c}} C_c^i \quad \text{and} \quad s_c = \frac{N_c}{N_{t,c}}    (9)
where Ω is a set of subjects whose data is available, C_c^i is the spatial covariance matrix for class c and subject i, N_c^i is the number of EEG trials used to estimate C_c^i, N_c is the number of EEG trials used to estimate C_c (the matrix for the target subject), and N_{t,c} is the total number of EEG trials for class c (from all subjects in Ω together with the target subject).
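For illustration, Eq. (9) amounts to a trial-count-weighted average of the other subjects' covariance matrices; a minimal sketch (the function name and argument layout are ours):

def ccsp1_generic_cov(other_covs, other_n_trials, n_total):
    # CCSP1 generic matrix of Eq. (9): covariance matrices weighted
    # by the number of trials used to estimate them
    return sum((n_i / n_total) * C_i
               for C_i, n_i in zip(other_covs, other_n_trials))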
With the second method, denoted as CCSP2, G_c is still a weighted sum of covariance matrices from other subjects, but the weights are defined according to the Kullback-Leibler (KL) divergence between subjects' data:

G_c = \sum_{i \in \Omega} \frac{1}{Z} \frac{1}{KL(i, t)} C_c^i \quad \text{with} \quad Z = \sum_{j \in \Omega} \frac{1}{KL(j, t)}    (10)
where KL(i, t) is the KL-divergence between the target subject t and subject i, and is defined as follows:

KL(i, t) = \frac{1}{2} \left[ \log \frac{\det(C_c)}{\det(C_c^i)} + tr(C_c^{-1} C_c^i) - N_e \right]    (11)

where det and tr are respectively the determinant and the trace of a matrix, and N_e is the number of electrodes used. With CCSP2, the scaling constant s_c is equal to 1.
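A NumPy sketch of Eqs. (10)-(11) could read as follows; the function names are ours, and C_target denotes the target subject's class-c covariance matrix:

import numpy as np

def kl_div(C_i, C_target):
    # KL divergence of Eq. (11) between zero-mean Gaussians with
    # covariances C_i (subject i) and C_target (target subject t)
    n_e = C_target.shape[0]
    log_det_ratio = np.log(np.linalg.det(C_target) / np.linalg.det(C_i))
    return 0.5 * (log_det_ratio
                  + np.trace(np.linalg.solve(C_target, C_i)) - n_e)

def ccsp2_generic_cov(C_target, other_covs):
    # CCSP2 generic matrix of Eq. (10): inverse-KL-weighted sum
    weights = np.array([1.0 / kl_div(C_i, C_target) for C_i in other_covs])
    weights /= weights.sum()  # normalization by Z
    return sum(w_i * C_i for w_i, C_i in zip(weights, other_covs))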
2) Regularized CSP with Generic Learning:
The RCSP approach with Generic Learning, proposed by Lu et al. [10] and denoted here as GLRCSP, is another approach which aims at regularizing the covariance matrix estimation using data from other subjects. GLRCSP uses both the β and γ regularization terms, i.e., it aims at shrinking the covariance matrix towards both the identity matrix and a generic covariance matrix G_c. Similarly to CCSP, G_c is here computed from the covariance matrices of other subjects, such that G_c = s_G \sum_{i \in \Omega} C_c^i, where

s_c = s_G = \frac{1}{(1 - \beta) M_{C_c} + \beta \sum_{i \in \Omega} M_{C_c^i}}

and M_C is the number of trials used to compute the covariance matrix C.
3) Regularized CSP with Diagonal Loading:
Another form of covariance matrix regularization used in the BCI literature is Diagonal Loading (DL), which consists in shrinking the covariance matrix towards the identity matrix. Thus, this approach only uses the γ regularization parameter (β = α = 0). Interestingly enough, in this case the value of γ can be automatically identified using Ledoit and Wolf's method [16]. We denote this RCSP based on automatic DL as DLCSPauto. In order to check the efficiency of this automatic regularization for discrimination purposes, we will also investigate a classical selection of γ using cross-validation; we denote the resulting algorithm as DLCSPcv. When using Ledoit and Wolf's method for automatic regularization, the value of γ selected to regularize C_1 can be different from that selected to regularize C_2. Therefore, we also investigated cross-validation to select a potentially different regularization parameter for C_1 and C_2; we denote this method as DLCSPcvdiff. To summarize, DLCSPauto automatically selects two γ regularization parameters (one for C_1 and one for C_2); DLCSPcv selects a single γ regularization parameter for both C_1 and C_2 using cross-validation; finally, DLCSPcvdiff selects two γ regularization parameters (one for C_1 and one for C_2) using cross-validation. It should be mentioned that, although covariance matrix regularization based on DL has been used in the BCI literature (see, e.g., [17]), to the best of our knowledge it has not been used for CSP regularization, but for the regularization of other algorithms such as Linear Discriminant Analysis (LDA).
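As an aside, the automatic shrinkage used by DLCSPauto can be reproduced with off-the-shelf tools: scikit-learn's LedoitWolf estimator implements Ledoit and Wolf's method [16]. Note that it shrinks towards a scaled identity \mu I rather than the plain I of Eq. 3, a common variant:

import numpy as np
from sklearn.covariance import LedoitWolf

# Band-pass filtered EEG samples of one class, shape (n_samples, n_channels);
# random placeholder data for illustration only
X = np.random.randn(1000, 32)

lw = LedoitWolf().fit(X)
C_reg = lw.covariance_   # automatically shrunk covariance estimate
gamma = lw.shrinkage_    # automatically selected shrinkage intensity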
4) Invariant CSP:
Invariant CSP (iCSP), proposed by Blankertz et al. [9], aims at regularizing the CSP objective function in order to make filters invariant to a given noise source (it uses β = γ = 0). To do so, the regularization matrix K is defined as the covariance matrix of this noise source, e.g., as the covariance matrix of the changing level of occipital α-activity. It should be mentioned that, to obtain this noise covariance matrix, additional EEG measurements must be performed to acquire the corresponding EEG signals and compute their covariance matrix. Since such measurements are not available for the EEG data sets analyzed here, iCSP will not be considered for evaluation in this paper. However, it still seems to be an efficient approach to make CSP robust against known noise sources.
B. New RCSP algorithms
In this section, we propose 4 new algorithms to regularize CSP: a Regularized CSP with Selected Subjects, a Tikhonov Regularized CSP, a Weighted Tikhonov Regularized CSP and a Spatially Regularized CSP.
1) Regularized CSP with Selected Subjects:
This first new RCSP belongs to the same family as CCSP, since it uses data from other subjects to shrink the covariance matrix towards a generic matrix G_c (it uses β > 0 and α = γ = 0). However, contrary to CCSP or GLRCSP, the proposed algorithm does not use the data from all available subjects, but only from selected subjects. Indeed, even if data from many subjects is available, it may not be relevant to use all of it, due to potentially large inter-subject variabilities. Thus, we propose to build G_c from the covariance matrices of a subset of selected subjects. We therefore denote this algorithm as RCSP with Selected Subjects, or SSRCSP. With SSRCSP, the generic covariance matrix is defined as

G_c = \frac{1}{|S_t(\Omega)|} \sum_{i \in S_t(\Omega)} C_c^i

where |A| is the number of elements in set A and S_t(Ω) is the subset of subjects selected from Ω.
To select an appropriate subset of subjects S_t(Ω), we propose the subject selection algorithm described in Algorithm 1. In this algorithm, the function accuracy = trainThenTest(trainingSet, testingSet) returns the accuracy obtained when training an SSRCSP with β = 1 (i.e., using only data from other subjects) on the data set trainingSet and testing it on the data set testingSet, with an LDA as classifier. The function (best_i, max_{f(i)}) = max_i f(i) returns best_i, the value of i for which f(i) reaches its maximum max_{f(i)}. In short, this algorithm sequentially selects the subject to add to, or to remove from, the current subset of subjects, in order to maximize the accuracy obtained when training the BCI on the data from this subset of subjects and testing it on the training data of the target subject. This algorithm has the same structure as the Sequential Forward Floating Search algorithm [18], used to select a relevant subset of features. This ensures the convergence of our algorithm as well as the selection of a good subset of additional subjects.
Input: D_t: training EEG data from the target subject.
Input: Ω = {D_s}, s ∈ [1, N_s]: set of EEG data from the N_s other subjects available (D_t ∉ Ω).
Output: S_t(Ω): a subset of relevant subjects whose data can be used to classify the data D_t of the target subject.

selected_0 = {}; remaining_0 = Ω; accuracy_0 = 0; n = 1;
while n < N_s do
    Step 1: (bestSubject, bestAccuracy) =
        max_{s ∈ remaining_{n-1}} trainThenTest(selected_{n-1} + {D_s}, D_t);
    selected_n = selected_{n-1} + {D_bestSubject};
    remaining_n = remaining_{n-1} - {D_bestSubject};
    accuracy_n = bestAccuracy;
    n = n + 1;
    Step 2: if n > 2 then
        (bestSubject, bestAccuracy) =
            max_{s ∈ selected_n} trainThenTest(selected_n - {D_s}, D_t);
        if bestAccuracy > accuracy_{n-1} then
            selected_{n-1} = selected_n - {D_bestSubject};
            remaining_{n-1} = remaining_n + {D_bestSubject};
            accuracy_{n-1} = bestAccuracy;
            n = n - 1;
            go to Step 2;
        else
            go to Step 1;
        end
    end
end
(bestN, selectedAcc) = max_{n ∈ [1, N_s]} accuracy_n;
S_t(Ω) = selected_bestN;

Algorithm 1: Subject selection algorithm for the SSRCSP (Regularized CSP with Selected Subjects) algorithm.
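For readers who prefer code, below is a hedged Python transcription of Algorithm 1. The function train_then_test is assumed to be provided (training an SSRCSP with β = 1 plus an LDA on the subset's data and testing on D_t), and the bookkeeping is simplified with dictionaries indexed by the subset size n; this is a sketch, not the authors' implementation:

def sffs_select_subjects(train_then_test, D_t, subjects):
    # `subjects` is the set Omega of other subjects' identifiers;
    # train_then_test(subset, D_t) returns the accuracy on D_t.
    selected = {0: frozenset()}   # best subset seen for each size n
    accuracy = {0: 0.0}
    n = 0
    while n < len(subjects) - 1:
        # Step 1 (forward): add the subject that maximizes accuracy on D_t
        remaining = set(subjects) - selected[n]
        scores = {s: train_then_test(selected[n] | {s}, D_t) for s in remaining}
        best_s = max(scores, key=scores.get)
        n += 1
        selected[n] = selected[n - 1] | {best_s}
        accuracy[n] = scores[best_s]
        # Step 2 (floating backward): drop a subject while accuracy improves
        while n > 2:
            scores = {s: train_then_test(selected[n] - {s}, D_t)
                      for s in selected[n]}
            worst_s = max(scores, key=scores.get)
            if scores[worst_s] > accuracy[n - 1]:
                n -= 1
                selected[n] = selected[n + 1] - {worst_s}
                accuracy[n] = scores[worst_s]
            else:
                break
    # Keep the subset size that achieved the best accuracy overall
    best_n = max(accuracy, key=accuracy.get)
    return set(selected[best_n])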
2) CSP with Tikhonov Regularization:
The next new algorithms we propose are based on the regularization of the CSP objective function using quadratic penalties (with α > 0, γ = β = 0 and s_c = 1). The first one is a CSP with Tikhonov Regularization (TR), or TRCSP. Tikhonov regularization is a classical form of regularization, initially introduced for regression problems [19], which consists in penalizing solutions with large weights. The penalty term is then P(w) = \|w\|^2 = w^T w = w^T I w. TRCSP is then simply obtained by using K = I in the proposed framework (see Table I). Such regularization is expected to constrain the solution to filters with a small norm, hence mitigating the influence of artifacts and outliers.
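As a usage note, with the illustrative rcsp_filters sketch given in Section III-B (C1 and C2 being the class covariance matrices from that sketch), TRCSP reduces to a single call; the value of alpha below is arbitrary:

import numpy as np

# TRCSP: quadratic penalty with K = I (Tikhonov regularization)
W_trcsp = rcsp_filters(C1, C2, K=np.eye(C1.shape[0]), alpha=0.1)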
3) CSP with Weighted Tikhonov Regularization:
With TRCSP, high weights are penalized equally for each channel. However, we know that some channels are more important than others to classify a given mental state. Thus, it may be interesting to have different penalties for different channels. If we believe that a channel is unlikely to have a large contribution in the spatial filters, then we should give it a relatively large penalty, in order to prevent CSP from assigning
