A tutorial on data-driven methods for statistically assessing ERP topographies.

Abstract
Dynamic changes in ERP topographies can be conveniently analyzed by means of microstates, the so-called "atoms of thoughts", that represent brief periods of quasi-stable synchronized network activation. Comparing temporal microstate features such as on- and offset or duration between groups and conditions therefore allows a precise assessment of the timing of cognitive processes. So far, this has been achieved by assigning the individual time-varying ERP maps to spatially defined microstate templates obtained from clustering the grand mean data into predetermined numbers of topographies (microstate prototypes). Features obtained from these individual assignments were then statistically compared. This has the problem that the individual noise dilutes the match between individual topographies and templates leading to lower statistical power. We therefore propose a randomization-based procedure that works without assigning grand-mean microstate prototypes to individual data. In addition, we propose a new criterion to select the optimal number of microstate prototypes based on cross-validation across subjects. After a formal introduction, the method is applied to a sample data set of an N400 experiment and to simulated data with varying signal-to-noise ratios, and the results are compared to existing methods. In a first comparison with previously employed statistical procedures, the new method showed an increased robustness to noise, and a higher sensitivity for more subtle effects of microstate timing. We conclude that the proposed method is well-suited for the assessment of timing differences in cognitive processes. The increased statistical power allows identifying more subtle effects, which is particularly important in small and scarce patient populations.


Thomas Koenig
Maria Stein
Matthias Grieder
Mara Kottlow
Received: 2 April 2012 / Accepted: 14 August 2013 / Published online: 29 August 2013
© Springer Science+Business Media New York 2013
Keywords Microstates · Timing · Statistics · Randomization · Topography · Model selection
Introduction
Scalp recorded evoked potentials permit the non-invasive
mapping of human brain functions at an excellent tem-
poral resolution. This allows for the decomposition of
complex cognitive processes into a sequence of process-
ing stages, each with a different functional significance
(Lehmann 1990; Murray et al. 2008). Importantly, an
unequivocal distinction of ERP components originating
from different brain regions can be obtained by com-
paring the topographies of scalp electromagnetic fields of
the ERP (McCarthy and Wood 1985; Michel et al. 2009).
By identifying and comparing ERP scalp topographies, it
is thus possible to track changes of brain functional
states, where a state is defined globally by a specific
distribution of one or several simultaneously active brain
regions. Spatial analysis of scalp electromagnetic fields
(Lehmann and Skrandies 1984) has moreover the
advantage of being reference independent, as topographic
configurations are not influenced by a reference electrode
(Lehmann 1987).
T. Koenig (✉) · M. Stein · M. Grieder · M. Kottlow
Department of Psychiatric Neurophysiology, University Hospital
of Psychiatry, University of Bern, Bern, Switzerland
e-mail: thomas.koenig@puk.unibe.ch
M. Stein
Department of Clinical Psychology and Psychotherapy, Institute
of Psychology, University of Bern, Bern, Switzerland
M. Kottlow
Institute of Pharmacology and Toxicology, University of Zurich,
Zurich, Switzerland
Brain Topogr (2014) 27:72–83
DOI 10.1007/s10548-013-0310-1

A commonly used way to compare multichannel ERP
data between groups or conditions is to quantify the dif-
ference of the topography in a given time range and to test
it for significance. Various such methods exist and have
been proven to allow for a sound assessment of topo-
graphic differences in ERPs (Koenig et al. 2011; Lehmann
1987; Lehmann et al. 1993; Nishida et al. 2013; Strik et al.
1998). If the topography of a certain process is known, it is
also possible to quantify the amount of ERP variance that
can be attributed to this process and compare different
datasets based on this quantifier (Brandeis et al. 1992).
Another approach to multichannel ERP analysis is the use of
data-driven spatio-temporal factor analyses, such as
principal component analysis (PCA), independent component
analysis (ICA), or, as discussed in more detail below,
cluster analysis. Factor analyses of multichannel ERP data
describe an ERP as composed of a
limited set of constant topographies, each with a specific
time course. The comparison of ERPs among different
groups or conditions is then primarily based on a com-
parison of the time-course of selected factors. A good
overview of spatial factor analysis methods (PCA, ICA,
microstates) in comparison to classical ERP approaches is
provided by Pourtois et al. (2008).
While PCA and ICA were primarily based on statistical
arguments such as independence among the factors, the
rationale for using cluster analysis emerged from the
observation of periods of stable field configurations typi-
cally separated by brief moments of rapid transitions
(Lehmann 1990; Wackermann et al. 1993). These periods
of quasi-stable field configurations were called microstates
(Lehmann and Skrandies 1980). They offered a natural,
data-driven and bottom-up definition of a brain functional
state as a period where a quasi-stable field configuration
was observed. Meanwhile, microstate analysis has become
a widely accepted tool for the assessment of the sequence
of functional states in ERPs (see Murray et al. 2008, for a
review). Microstates could also be observed in the elec-
trocorticogram of mice (Megevand et al. 2008). In addition,
it is also possible to identify microstates in the ongoing
resting EEG (Koenig et al. 2002; Lehmann 1990) and
microstate analyses of single trial ERP data have been
proven to be a sensitive and unique tool to track cognitive
processes on a single subject level (De Lucia et al. 2010,
2012; Tzovara et al. 2012a, b, 2013).
Technically, ERP microstate analysis based on spatial
clustering identifies a small set of prototypical ERP
topographies that can be observed in the measured data (so
called microstate class maps) and assigns each time period
of the ERP to exactly one of these microstate class maps
based on a best fit criterion (Murray et al. 2008; Pascual-
Marqui et al. 1995). Whereas the microstate maps corre-
spond to the forward solution of all sources contributing to
a microstate class, the assignment step yields the time of
the on- and offset of the microstates in the ERP. If this
algorithm is used to identify microstates in data consisting
of several experimental conditions or groups, the assign-
ment can be used to identify differences in the timing of a
given microstate class (i.e. onset, offset and duration),
which is a very elegant and efficient way to exploit the
information yielded by the high temporal resolution of the
data.
On the level of statistics, the microstate analyses per-
formed so far have been done by identifying the microstate
maps in ERP datasets averaged over a group of subjects
(grand mean ERPs), but the assignment was then done in
the ERPs of the individuals. From these individual assignments,
several parameters were extracted for a given
microstate map, such as the variance explained by the map,
the time when the first or last assignment to the map was
observed, or the total number of time points assigned to the
map. These individual parameters were then entered into
classical, usually parametric, univariate test statistics such
as t tests or ANOVAs (Michel et al.
2009).
While this approach has been applied successfully in a
series of studies (Arzy et al. 2007; Chouiter et al. 2013;
Darque et al. 2012; Knebel and Murray 2012; Kottlow
et al. 2011; Kovalenko et al. 2012; Laganaro and Perret
2011; Overney et al. 2005; Pannekamp et al. 2011; Pegna
et al. 1997; Perret and Laganaro 2012; Pourtois 2011;
Spierer et al. 2007; Stevenson et al. 2012; Taha et al. 2013),
it appeared to the authors that the method can still be
improved to increase statistical power and decrease the
effects of individual variance. Our criticism is that in the
above-described approach, the microstate maps are compared
to data that were not directly available to the clustering
algorithm, which necessarily reduces the amount of variance
explained by the microstate maps. Furthermore, the individual
data contain individual variance that is usually of little
interest but reduces the topographic similarity to the
microstate maps. We suspect that
this loss of similarity resulting from comparing microstate
maps obtained in grand mean data to individual ERPs may
negatively affect the resulting statistical power.
Our proposal is thus to develop a statistical test for
microstate features where the assignment procedure
remains on the level of the grand mean data. This is
expected to improve the similarity between the microstate
maps and the data these maps are assigned to, and thus
increase the statistical power of the results. For this pur-
pose, we will employ randomization techniques, which
(although computationally expensive) allow custom-tai-
loring statistical tests to such specific problems.
A further aim of the paper is to propose a solution to the
problem of selecting the appropriate number of microstate
maps. This selection has so far been made on criteria
extracted from grand mean data (Pascual-Marqui et al.
1995), and the individual variance has been neglected. In
general, the aim of model selection procedures (such as
selecting a number of microstate maps) is to choose a
model that captures as much of that part of the data that
follows some generalizable rules, while it is oblivious to
random noise. Our proposal is that in ERP microstate
models, the generalizability of the model can be assessed
by testing its consistency across subjects; the parts of the
data that can be observed independently of the individual
subjects belong to the optimal microstate model, while
those parts of the data that depend on the individual sub-
jects should not be part of the model. The optimal model
(i.e. the optimal number of microstate maps) should thus
maximize the amount of explained variance that is inde-
pendent of individual attributes. This criterion can be
evaluated using cross-validation procedures across subjects
(Devijver and Kittler 1982).
In the following methods and results sections, we will
give a detailed explanation of the procedures and apply them
to a real sample dataset and a series of simulated datasets
with defined signal to noise ratios (SNRs). We will then
also analyze the same dataset with the established meth-
odology and compare the results.
As a sample dataset, we chose data of healthy US
American subjects staying in Switzerland for a German
language exchange. EEG was measured while subjects
performed a sentence reading task once at the beginning
and once at a later phase of their stay (Stein et al. 2006).
These sentences ended with semantically correct or incor-
rect endings. Incorrect versus correct sentence endings
have been found to induce a so-called N400 effect which
was described by Kutas and Hillyard (1980).
Methods
Selection of the Optimal Microstate Model
As outlined in the introduction, we aimed to identify a
microstate model that is sufficiently complex to accom-
modate the part of the data variance that is common across
subjects, while not accounting for variance that
appears to be tied to individual attributes. This type of
problem is typically addressed using cross-validation,
where models of different complexity are constructed
based on a subset of the available data, and the resulting
models are then used to make predictions for the remaining
data. Therein, the optimal model is the one that minimizes
the prediction error (Devijver and Kittler 1982).
In the context of microstate modeling, we propose to
implement microstate model selection through cross-vali-
dation by computing microstate models with different
numbers of microstate classes based on ERPs averaged
over a subset of the subjects (training data). These micro-
state models are then tested for their predictive value
(mean correlation) in the ERPs averaged over the subjects
not included in the construction of the microstate model
(test data). Since the mean correlation of the test data with
a model will depend on the division of the data into
training- and test-sets, this procedure has to be repeated
with different, randomly created subsets of training and test
data. For each number of microstates, the mean correlation
of the test data with the model is then averaged across the
results obtained in the different subsets. The optimal
number of microstates is selected where this grand mean
correlation is maximal.
Note that this procedure contains no measures to mini-
mize the total number of microstates per se, but only
minimizes the number of microstates that cannot be found
consistently across subjects. The encountered number of
microstates therefore does not represent something that
necessarily generalizes across studies, but rather something
that is optimally suited for a dataset with a limited size.
Computationally, the procedure is illustrated in Fig. 1
and is as follows:
1. The algorithm randomly subdivides the subjects into a
training and a test dataset. If the subjects belonged to dif-
ferent groups, each dataset must contain members of all
groups.
2. Grand mean ERPs are computed in the training and
test datasets as a function of group and condition.
3. Spatio-temporal microstate models with different
numbers of microstate maps are computed in the grand
means of the training dataset. Each model contains both the
topographies of the microstate maps and the time
instances when these microstate maps are observed.
Fig. 1 Flow-chart illustrating the procedure for the selection of the
optimal microstate model
4. The mean correlation of the test data with each
microstate model is computed (Eq. 1):

$$\text{Mean correlation} = \frac{\sum_{t=1}^{n_t} \mathrm{Corr}(V_t, T_t)}{n_t} \qquad (1)$$

where $t$ is time, $n_t$ is the number of time points, Corr is the
correlation function, $V_t$ is the voltage vector of the test data
at time $t$, and $T_t$ is the voltage vector of the microstate class
observed in the training data at time $t$. If several conditions
or groups are available, the mean correlation is computed
in each condition and group and averaged.
5. Steps 1–4 are repeated a sufficient number of
times, and the mean correlations from each run are
retained.
6. The mean correlations are averaged across repetitions
and the number of microstate classes yielding the maxi-
mum mean correlation is identified. This represents the
optimal number of microstate classes for the analysis of the
given dataset.
7. The microstate templates with the optimal number of
classes are now computed using the grand mean ERPs of
all available subjects and conditions.
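The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the array layout `(subject, condition, time, channel)`, all function names, and the even split into training and test subjects are choices made for the example. In particular, `fit_microstates` is a plain k-means on normalized maps that merely stands in for whatever microstate clustering algorithm is actually used. A single group of subjects is assumed.

```python
import numpy as np

def spatial_corr(a, b):
    """Pearson correlation across channels, row by row, for two (time, channel) arrays."""
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    denom = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    return (a * b).sum(axis=1) / np.where(denom == 0, 1.0, denom)

def mean_correlation(test_erp, template_seq):
    """Eq. 1: average over time points of Corr(V_t, T_t)."""
    return spatial_corr(test_erp, template_seq).mean()

def fit_microstates(erp, n_classes, rng, n_iter=20):
    """Hypothetical stand-in clustering: k-means on average-referenced, unit-norm maps.
    Returns the class maps and the class label of every time point."""
    x = erp - erp.mean(axis=1, keepdims=True)
    x = x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-12)
    maps = x[rng.choice(len(x), n_classes, replace=False)]
    for _ in range(n_iter):
        labels = np.argmax(x @ maps.T, axis=1)       # best-fitting map per time point
        for k in range(n_classes):
            if np.any(labels == k):
                m = x[labels == k].mean(axis=0)
                maps[k] = m / max(np.linalg.norm(m), 1e-12)
    labels = np.argmax(x @ maps.T, axis=1)
    return maps, labels

def select_n_classes(erps, candidates, n_runs=50, seed=0):
    """erps: (subject, condition, time, channel). Returns the best class count and all scores."""
    rng = np.random.default_rng(seed)
    n_sub, n_cond, n_time, n_chan = erps.shape
    scores = np.zeros((n_runs, len(candidates)))
    for r in range(n_runs):                              # step 5: repeat steps 1-4
        perm = rng.permutation(n_sub)                    # step 1: random split of subjects
        train, test = perm[: n_sub // 2], perm[n_sub // 2:]
        gm_train = erps[train].mean(axis=0).reshape(-1, n_chan)   # step 2: grand means
        gm_test = erps[test].mean(axis=0).reshape(-1, n_chan)
        for j, k in enumerate(candidates):
            maps, labels = fit_microstates(gm_train, k, rng)      # step 3: model per class count
            # step 4: T_t is the training class map observed at time t
            scores[r, j] = mean_correlation(gm_test, maps[labels])
    mean_scores = scores.mean(axis=0)                    # step 6: average across repetitions
    return candidates[int(np.argmax(mean_scores))], mean_scores
```

Because every condition contributes the same number of time points, averaging over the concatenated conditions equals averaging the per-condition mean correlations. Step 7 would then call the clustering once more on the grand means of all subjects with the selected number of classes.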
Once the optimal microstate model has been identified,
we can proceed to the statistical evaluation of the experi-
mental manipulations in the entire dataset.
Statistical Testing of Differences in Microstate Models
As in any statistical testing, an analysis of ERP microstate
features needs to compare an effect (e.g. a difference in the
onset of a given microstate class in the ERPs of two
groups) against the distribution of this effect under the null-
hypothesis. While in classical statistics, this distribution is
estimated based on the variance of the individual data, and
on assumptions about the nature of the distribution, ran-
domization statistics determine this distribution based on
simulations of the effect under the null hypothesis. For our
purposes, the important point here is that with randomi-
zation statistics, we can simulate ERP data under the null-
hypothesis and still compute grand mean ERPs, and
therefore still assess microstate effects based on these
grand means while the null-hypothesis is true.
In general, randomization based statistics consist of the
following three steps (Manly 2007):
1. Quantification of an effect of interest in the measured
data.
2. Creation of cases of the same quantifier compatible
with the null hypothesis. This is achieved by repeat-
edly applying the quantifier to the measured data after
randomizing it in a way that eliminates the suspected
structure in the data.
3. Comparison of the quantifier obtained in the real data
with the distribution of the quantifier under the
null-hypothesis.
We will follow this scheme for our microstate statistics,
with the constraint that the assignment procedure shall
always be applied on the level of the grand mean data. The
proposed procedure is also illustrated in Fig. 2.
To quantify the effect of interest (step 1), we propose to
use the previously employed features extracted from the
established microstate assignment procedures (Murray
et al. 2008). These features are specific for a given
microstate map and for the given ERP and include, among
others, the amount of variance explained by the map, the
time point of the first (onset) or last (offset) assignment of
the ERP to that map, or the count of time-points assigned to
the maps. The important difference to the previously pro-
posed method is that in our procedure, these features are
extracted after the microstate maps have been assigned to
group and/or condition specific grand mean data and not to
the individual data. The quantifier of the effect of interest is
then defined by the variance of the feature extracted from
the different groups and/or conditions. For example, in an
analysis of the onset of a language related microstate under
two different conditions, the quantifier of the effect of
interest could be defined as the difference of onset of the
first occurrence of the language related microstate map
between the two conditions (the difference here is equivalent
to the variance of the two onsets, because the variance of two
values is proportional to their squared difference). If we
hypothesized that the language related microstate
systematically differs between three groups of subjects, our
quantifier could for example be the variance among the
onsets obtained from the grand means of each of the three
groups.
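As a rough illustration of step 1, the following Python sketch extracts the onset of one microstate class from each group or condition grand mean and uses the variance of these onsets as the quantifier. The function names, the best-fit rule (highest spatial correlation, without any temporal smoothing), and the use of sample indices rather than milliseconds are assumptions made for this example only.

```python
import numpy as np

def assign_maps(grand_mean, maps):
    """Label each time point of a (time, channel) grand mean with its best-correlating map."""
    x = grand_mean - grand_mean.mean(axis=1, keepdims=True)
    m = maps - maps.mean(axis=1, keepdims=True)
    x = x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), 1e-12)
    m = m / np.maximum(np.linalg.norm(m, axis=1, keepdims=True), 1e-12)
    return np.argmax(x @ m.T, axis=1)

def onset(labels, k):
    """Index of the first time point assigned to class k (NaN if the class never occurs)."""
    idx = np.flatnonzero(labels == k)
    return float(idx[0]) if idx.size else np.nan

def onset_quantifier(grand_means, maps, k):
    """Variance of the onset of class k across the group/condition grand means."""
    onsets = [onset(assign_maps(gm, maps), k) for gm in grand_means]
    return float(np.var(onsets))

# Toy example: class 1 starts at sample 5 in condition A and at sample 8 in condition B.
maps = np.array([[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]])
gm_a = np.vstack([np.tile(maps[0], (5, 1)), np.tile(maps[1], (7, 1))])
gm_b = np.vstack([np.tile(maps[0], (8, 1)), np.tile(maps[1], (4, 1))])
print(onset_quantifier([gm_a, gm_b], maps, 1))
```

With two conditions, `np.var` returns a quarter of the squared onset difference, so ranking by variance and ranking by absolute difference give the same result. If a class never occurs in one of the grand means, the quantifier is NaN and needs separate handling.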
For the creation of instances of the chosen quantifier
under the null hypothesis (step 2), we propose to randomize
the ERP data such that the possible suspected structure of
interest in the data is eliminated. For example, if we sup-
pose that semantically expected and unexpected sentence
endings systematically lead to different responses in a
group of subjects, we would construct data with two ran-
dom conditions R1 and R2 and randomly assign, in each
subject, the ERPs of expected sentence endings to either
R1 or R2, and the ERPs of unexpected sentence endings to
the remaining random condition. If we expected that two
groups of subjects (e.g. good and weak learners) differ
systematically, we would randomly shuffle the ERPs of
each subject among the two groups. Once this randomi-
zation has been done, the random group and/or condition
‘specific’ grand mean ERPs can be computed, and the
quantifier of interest can again be computed as above. The
important difference to the previously employed procedure
is again that the microstate assignment necessary for the
feature extraction is computed in grand mean data.
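The randomization described for step 2 can be sketched as follows. This is a hedged example, not the authors' code: the array layout `(subject, condition, time, channel)`, the function names, and the `quantifier` callback (any function that maps condition-wise grand means to a single number) are assumptions introduced here.

```python
import numpy as np

def randomize_conditions(erps, rng):
    """Within each subject, randomly reassign the condition ERPs to the random conditions."""
    out = np.empty_like(erps)
    for s in range(erps.shape[0]):
        out[s] = erps[s, rng.permutation(erps.shape[1])]
    return out

def randomize_groups(group_labels, rng):
    """Randomly shuffle subjects among groups while keeping the group sizes."""
    return rng.permutation(group_labels)

def null_distribution(erps, quantifier, n_runs, seed=0):
    """Apply the quantifier to grand means of repeatedly randomized data."""
    rng = np.random.default_rng(seed)
    values = np.empty(n_runs)
    for r in range(n_runs):
        random_grand_means = randomize_conditions(erps, rng).mean(axis=0)
        values[r] = quantifier(random_grand_means)   # assignment stays on grand-mean level
    return values
```

The shuffling is done per subject so that each subject still contributes exactly one ERP to each random condition; only the link between the ERP and the experimental condition is destroyed.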
Finally, the quantifier obtained in the measured data in
step 1 is compared to the distribution of the quantifier
obtained under the null hypothesis (step 3). This is done by
simple rank statistics, and the probability of the data being
compatible with the null hypothesis is defined by the pro-
portion of quantifiers obtained under the null-hypothesis
that were larger than or equal to the quantifier obtained in the
real data. As an example, let us assume that, in our first
example above, the difference of onset obtained from the
randomized data was larger than or equal to the difference
obtained from the real data in 7 out of 500 cases. The probability
p that the observed difference is compatible with the null
hypothesis is then 7/500 = 0.014, which would (given an
alpha-level of 0.05) indicate that it is significant. If the
variance of the onset of the three groups obtained after
randomizing the data were larger than or equal to the variance
obtained in the real data in 1,293 out of 5,000 randomization
runs, the probability p that the observed group differences
were obtained by chance would be estimated to be
1,293/5,000 = 0.259, which would typically be considered
not significant. Note that the distribution of the quantifier
under the null-hypothesis depends on the precise random
permutations and assignments and may thus vary. The
resulting p value is thus not an exact value, but an estimate.
The literature suggests that for a reliable rejection of the
null-hypothesis on a 5 % level, 1,000 randomization runs are
necessary, and for an estimate at the 1 % level, 5,000
randomization runs are recommended (Manly 2007). In
contrast to parametric methods, statistical tests such as the one
described here are ultimately based on rank statistics.
Therefore, they can be expected to be more robust against
false positive results due to biases and outliers in individual
data.
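The rank-based p value of step 3 amounts to a simple count. The short Python sketch below reproduces the two numerical examples given above; the function name and the toy null distributions are illustrative assumptions.

```python
import numpy as np

def randomization_p(observed, null_values):
    """Proportion of null-hypothesis quantifiers that are >= the observed quantifier."""
    null_values = np.asarray(null_values, dtype=float)
    return np.count_nonzero(null_values >= observed) / null_values.size

# First example: 7 of 500 randomized differences reach the observed one (7 / 500 = 0.014).
null_a = np.concatenate([np.full(7, 2.0), np.zeros(493)])
p_a = randomization_p(1.0, null_a)

# Second example: 1,293 of 5,000 randomized variances reach the observed one (about 0.259).
null_b = np.concatenate([np.full(1293, 2.0), np.zeros(3707)])
p_b = randomization_p(1.0, null_b)
print(p_a, p_b)
```

Since the null distribution is itself drawn at random, repeating the procedure with a different seed yields a slightly different estimate, which is why more randomization runs are recommended for stricter significance levels.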
Sample Data Analysis and Simulations
The sample data and analysis are based on an experiment
that has previously been used to demonstrate statistical
procedures of the analysis of ERPs (Koenig et al. 2008,
2011). These data consist of ERPs recorded in 16 healthy
young English-speaking exchange students that spent a
year in the German-speaking part of Switzerland and that
participated in a larger study on the neurobiology of
training-related changes of the language system (Koenig
et al. 2008; Stein et al. 2006). Participants passively viewed
German sentences that were presented word by word on a
computer screen and had semantically expected or unexpected
endings. This is a typical setup to elicit the so-called
N400, an ERP component that is associated with the violation
of semantic expectancy and characterized by a
parietal negativity peaking around 400 ms after stimulus
presentation (Brandeis et al. 1995; Kutas and Hillyard
1980). Subjects were recorded twice, once at the beginning
of their stay, and once after having lived about 3 months in
Switzerland. The aim of the experiment was to track the
progress of semantic integration in the acquired foreign
language using an N400 paradigm. The measured data
Fig. 2 Flow-chart depicting the proposed statistical testing of the microstate models
References
Manly BFJ (2007) Randomization, bootstrap and Monte Carlo methods in biology
Kutas M, Hillyard SA (1980) Reading senseless sentences: brain potentials reflect semantic incongruity
McCarthy G, Wood CC (1985) Scalp distributions of event-related potentials: an ambiguity associated with analysis of variance models
Lehmann D, Skrandies W (1980) Reference-free identification of components of checkerboard-evoked multichannel potential fields