HAL Id: inria-00134950
https://hal.inria.fr/inria-00134950
Submitted on 6 Mar 2007
A review of classification algorithms for EEG-based
brain–computer interfaces
Fabien Lotte, Marco Congedo, Anatole Lécuyer, Fabrice Lamarche, Bruno
Arnaldi
To cite this version:
Fabien Lotte, Marco Congedo, Anatole Lécuyer, Fabrice Lamarche, Bruno Arnaldi. A review of
classification algorithms for EEG-based brain–computer interfaces. Journal of Neural Engineering,
IOP Publishing, 2007, 4, pp.24. ⟨inria-00134950⟩
TOPICAL REVIEW
A Review of Classification Algorithms for
EEG-based Brain-Computer Interfaces
F Lotte¹, M Congedo², A Lécuyer¹, F Lamarche¹ and B Arnaldi¹
¹ IRISA / INRIA Rennes, Campus universitaire de Beaulieu, Avenue du Général
Leclerc, 35042 RENNES Cedex, France
² France Telecom R&D, Tech/ONE Laboratory, 28 Chemin du vieux Chêne, InoVallée,
38240 Grenoble, France
E-mail: fabien.lotte@irisa.fr
Abstract. In this paper we review classification algorithms used to design Brain-
Computer Interface (BCI) systems based on ElectroEncephaloGraphy (EEG). We
briefly present the commonly employed algorithms and describe their critical
properties. Based on the literature, we compare them in terms of performance and
provide guidelines for choosing suitable classification algorithm(s) for a specific BCI.
PACS numbers: 8435, 8780
1. Introduction
A Brain-Computer Interface (BCI) is a communication system that does not require
any peripheral muscular activity [1]. Indeed, BCI systems enable a subject to send
commands to an electronic device only by means of brain activity [2]. Such interfaces
can be considered the only means of communication for people affected by a
number of motor disabilities [3].
In order to control a BCI, the user must produce different brain activity patterns
that will be identified by the system and translated into commands. In most existing
BCIs, this identification relies on a classification algorithm [4], i.e., an algorithm that aims
at automatically estimating the class of data as represented by a feature vector [5]. Due
to the rapidly growing interest in EEG-based BCI, a considerable number of published
results are related to the investigation and evaluation of classification algorithms. To
date, very interesting reviews of BCI have been published [1] [6], but none has been
specifically dedicated to the review of classification algorithms used for BCI, their
properties and their evaluation. This paper aims to fill this gap. Therefore, one of
the main objectives of this paper is to survey the different classification algorithms used
in EEG-based BCI research and to identify their critical properties. Another objective
is to provide guidelines to help the reader choose the most appropriate
classification algorithm for a given BCI experiment. This amounts to comparing the
algorithms and assessing their performance according to the context.
This paper is organized as follows: Section 2 depicts a BCI as a pattern recognition
system and emphasizes the role of classification. Section 3 surveys the classification
algorithms used for BCI and finally, Section 4 assesses them and identifies their usability
depending on the context.
2. Brain-Computer Interfaces seen as a pattern recognition system
The very aim of BCI is to translate brain activity into a command for a computer. To
achieve this goal, either regression [7] or classification [8] algorithms can be used. Using
classification algorithms is the most popular approach. These algorithms are used to
identify “patterns” of brain activity [4]. In this paper, we consider a BCI system as
a pattern recognition system [5] [9] and focus on the classification algorithms used to
design them. The performance of a pattern recognition system depends on both the features
and the classification algorithm employed. These two components are highlighted in
this section.
2.1. Feature extraction for BCI
In order to select the most appropriate classifier for a given BCI system, it is essential to
clearly understand what features are used, what their properties are and how they are
used. This section aims at describing the common BCI features and more particularly
their properties as well as the way to use them in order to consider time variations of
EEG.
2.1.1. Feature properties
A great variety of features have been used to design BCI, such as amplitude
values of EEG signals [10], Band Powers (BP) [11], Power Spectral Density (PSD) values
[12] [13], AutoRegressive (AR) and Adaptive AutoRegressive (AAR) parameters [8] [14],
time-frequency features [15] and inverse model-based features [16] [17] [18]. When
designing a BCI system, some critical properties of these features must be considered:
• noise and outliers: BCI features are noisy or contain outliers because EEG signals
have a poor signal-to-noise ratio;
• high dimensionality: In BCI systems, feature vectors are often of high
dimensionality, e.g., [19]. Indeed, several features are generally extracted from
several channels and from several time segments before being concatenated into a
single feature vector (see next section);
• time information: BCI features should contain time information as brain activity
patterns are generally related to specific time variations of EEG (see next section);
• non-stationarity: BCI features are non-stationary since EEG signals may vary
rapidly over time and, especially, across sessions;
• small training sets: The training sets are relatively small, since the training
process is time consuming and demanding for the subjects.
These properties hold for most features currently used in BCI research.
However, it should be noted that they may no longer hold for BCI used in clinical
practice. For instance, the training sets obtained for a given patient would no longer be
small, as a large quantity of data would have been acquired during sessions performed
over days and months. As the use of BCI in clinical practice is still very limited [3], this
paper deals with classification methods used in BCI research. However, the reader
should be aware that problems may be different for BCI used outside the laboratory.
2.1.2. Considering time variations of EEG
Most brain activity patterns used to drive BCI are related to particular time
variations of EEG, possibly in specific frequency bands [1]. Therefore, the time course
of EEG signals should be taken into account during feature extraction [20]. To use this
temporal information, three main approaches have been proposed:
• concatenation of features from different time segments: It consists in
extracting features from several time segments and concatenating them into a single
feature vector [11] [20];
• combination of classifications at different time segments: It consists in
performing the feature extraction and classification steps on several time segments
and then combining the results of the different classifiers [21] [22];
• dynamic classification: It consists in extracting features from several time
segments to build a temporal sequence of feature vectors. This sequence can be
classified using a dynamic classifier [20] [23] (see Section 2.2.1).
The first approach is the most widely used, which explains why feature vectors are
often of high dimensionality.
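The first approach can be illustrated with a minimal sketch. It is not taken from any of the cited studies: the function names, the equal-length non-overlapping segmentation and the use of mean squared amplitude as a crude stand-in for band power are all assumptions made for illustration.

```python
from typing import List, Sequence


def segment_power(samples: Sequence[float]) -> float:
    """Mean squared amplitude of one segment (assumed stand-in for band power)."""
    return sum(x * x for x in samples) / len(samples)


def concatenated_features(
    trial: Sequence[Sequence[float]], n_segments: int
) -> List[float]:
    """Build one feature vector from a trial shaped [channel][sample].

    Each channel is cut into n_segments equal, non-overlapping time segments.
    One power value is computed per (channel, segment) pair, and all values
    are concatenated channel by channel.
    """
    features: List[float] = []
    for channel in trial:
        seg_len = len(channel) // n_segments  # trailing samples are dropped
        for s in range(n_segments):
            segment = channel[s * seg_len:(s + 1) * seg_len]
            features.append(segment_power(segment))
    return features


# Toy trial: 2 channels, 8 samples each (values are made up for illustration).
trial = [
    [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0],
    [0.0, 0.0, 0.0, 0.0, 3.0, 3.0, 3.0, 3.0],
]
vector = concatenated_features(trial, n_segments=2)
print(vector)       # [1.0, 4.0, 0.0, 9.0]
print(len(vector))  # 4 = 2 channels x 2 segments x 1 feature
```

The length of the resulting vector is the product of the number of channels, the number of time segments and the number of features per segment, which is why this approach quickly leads to high-dimensional feature vectors.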
2.2. Classification algorithms
In order to choose the most appropriate classifier for a given set of features, the properties
of the available classifiers must be known. This section provides a classifier taxonomy. It
also deals with two classification problems especially relevant for BCI research, namely,
the curse-of-dimensionality and the Bias-Variance tradeoff.
2.2.1. Classifier taxonomy
Several definitions are commonly used to describe the different kinds of available
classifiers:
Generative-discriminative:
Generative (also known as informative) classifiers, e.g., Bayes quadratic, learn
the class models. To classify a feature vector, generative classifiers compute the
likelihood of each class and choose the most likely. Discriminative ones, e.g.,
Support Vector Machines, only learn the way of discriminating the classes or the
class membership in order to classify a feature vector directly [24] [25];
Static-dynamic:
Static classifiers, e.g., MultiLayer Perceptrons, cannot take into account temporal
information during classification as they classify a single feature vector. On the
contrary, dynamic classifiers, e.g., Hidden Markov Model, can classify a sequence
of feature vectors and thus capture temporal dynamics [26].
Stable-unstable:
Stable classifiers, e.g., Linear Discriminant Analysis, have a low complexity (or
capacity [27]). They are said to be stable as small variations in the training set do not
considerably affect their performance. On the contrary, unstable classifiers, e.g.,
MultiLayer Perceptron, have a high complexity. For them, small variations of
the training set may lead to important changes in performance [28].
Regularized:
Regularization consists in carefully controlling the complexity of a classifier in
order to prevent overtraining. A regularized classifier has good generalization
performance and is more robust with respect to outliers [5] [9].
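To make the generative side of the generative-discriminative distinction above more concrete, the following sketch learns one Gaussian model per class from a single scalar feature, then classifies a new value by computing the likelihood of each class and choosing the most likely. It is only an illustration in the spirit of a Bayes quadratic classifier: the one-dimensional feature, the equal class priors, the function names and the toy data are all assumptions.

```python
import math
from typing import Dict, List, Tuple


def fit_class_models(
    data: Dict[str, List[float]]
) -> Dict[str, Tuple[float, float]]:
    """Learn one (mean, variance) Gaussian model per class."""
    models = {}
    for label, values in data.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        models[label] = (mean, var)
    return models


def log_likelihood(x: float, mean: float, var: float) -> float:
    """Log of the Gaussian density N(x; mean, var)."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def classify(x: float, models: Dict[str, Tuple[float, float]]) -> str:
    """Return the class whose model gives x the highest likelihood.

    Equal class priors are assumed, so comparing likelihoods is enough.
    """
    return max(models, key=lambda label: log_likelihood(x, *models[label]))


# Made-up scalar feature values for two hypothetical mental tasks.
training = {
    "left": [1.0, 1.2, 0.8, 1.1, 0.9],
    "right": [3.0, 3.3, 2.7, 3.1, 2.9],
}
models = fit_class_models(training)
print(classify(1.3, models))  # left
print(classify(2.8, models))  # right
```

A discriminative classifier would skip the per-class models entirely and learn only the decision rule separating the two classes.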
2.2.2. Main classification problems in BCI research
While performing a pattern recognition task, classifiers may face several
problems related to the feature properties, such as outliers, overtraining, etc. In the
field of BCI, two main problems need to be underlined: the curse-of-dimensionality and
the Bias-Variance tradeoff.
The curse-of-dimensionality:
The amount of data needed to properly describe the different classes increases
exponentially with the dimensionality of the feature vectors [9] [29]. Indeed, if the
number of training data is small compared to the size of the feature vectors, the
classifier will most probably give poor results. It is recommended to use at least
five to ten times as many training samples per class as the dimensionality [30] [31].
Unfortunately, this cannot be applied in all BCI systems as, generally, the
dimensionality is high and the training set small (see Section 2.1.1). Therefore this “curse” is
a major concern in BCI design.
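The rule of thumb quoted above can be turned into a quick arithmetic check. The sketch below is only an illustration: the function names and the example setup (8 channels, 4 time segments, 2 features per segment, 100 trials per class) are hypothetical and not taken from any cited study.

```python
def samples_needed_per_class(dimensionality: int, factor: int = 5) -> int:
    """Training samples per class suggested by the rule of thumb.

    The factor is assumed to lie between 5 and 10.
    """
    return factor * dimensionality


def satisfies_rule(n_per_class: int, dimensionality: int, factor: int = 5) -> bool:
    """True if the training set meets the rule for the given factor."""
    return n_per_class >= samples_needed_per_class(dimensionality, factor)


# Hypothetical setup: 8 channels x 4 time segments x 2 features per segment.
dim = 8 * 4 * 2
print(dim)                                 # 64
print(samples_needed_per_class(dim, 5))    # 320
print(samples_needed_per_class(dim, 10))   # 640
print(satisfies_rule(100, dim))            # False
```

Even this modest hypothetical configuration would call for several hundred trials per class, far more than the 100 assumed here.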
The Bias-Variance tradeoff: