Linear Modeling of Neurophysiological Responses to
Naturalistic Stimuli: Methodological Considerations for
Applied Research
Michael J. Crosse 1,2,*, Nathaniel J. Zuk 3,*, Giovanni M. Di Liberto 4,5, Aaron R. Nidiffer 6, Sophie Molholm 2, and Edmund C. Lalor 6,†

1 X, The Moonshot Factory, Mountain View, CA
2 Department of Pediatrics and Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY
3 Edmond & Lily Safra Center for Brain Sciences, Hebrew University, Jerusalem, Israel
4 Trinity Centre for Biomedical Engineering, Trinity College Institute of Neuroscience, Dept of Mechanical, Manufacturing and Biomedical Engineering, Trinity College, The University of Dublin, Ireland
5 School of Electrical and Electronic Engineering and UCD Centre for Biomedical Engineering, University College Dublin, Ireland
6 Department of Biomedical Engineering and Department of Neuroscience, University of Rochester, Rochester, NY
Abstract
Cognitive neuroscience has seen an increase in the use of linear modeling techniques for studying the processing of natural, environmental stimuli. The availability of such computational tools has prompted similar investigations in many clinical domains, facilitating the study of cognitive and sensory deficits within an ecologically relevant context. However, studying clinical (and often highly heterogeneous) cohorts introduces an added layer of complexity to such modeling procedures, leading to an increased risk of improper usage of such techniques and, as a result, inconsistent conclusions. Here, we outline some
key methodological considerations for applied research and include worked examples of both simulated
and empirical electrophysiological (EEG) data. In particular, we focus on experimental design, data
preprocessing and stimulus feature extraction, model design, training and evaluation, and interpretation
of model weights. Throughout the paper, we demonstrate how to implement each stage in MATLAB using
the mTRF-Toolbox and discuss how to address issues that could arise in applied cognitive neuroscience
research. In doing so, we highlight the importance of understanding these more technical points for
experimental design and data analysis, and provide a resource for applied and clinical researchers
investigating sensory and cognitive processing using ecologically rich stimuli.
Keywords: temporal response function, TRF, neural encoding, neural decoding, clinical and translational
neurophysiology, electrophysiology, EEG.
* These authors contributed equally to this work.
† E-mail: edmund_lalor@urmc.rochester.edu (E.C.L.)
Introduction
A core focus of cognitive neuroscience is to identify neural correlates of human behavior, with the
intention of understanding cognitive and sensory processing. Such correlates can be used to explicitly
model the functional relationship between some “real world” parameters describing a stimulus or a person’s behavior and the related brain activity. In particular, linear modeling techniques have become ubiquitous within cognitive neuroscience because they provide a means of studying the processing of dynamic sensory inputs such as natural scenes and sounds (Wu et al., 2006; Holdgraf et al., 2017). Unlike
event-related potentials (ERPs) – which are a direct measurement of the average neural response to a
discrete event – linear models seek to capture how changes in a stimulus dimension or cognitive state are
linearly reflected in the recorded brain activity. In other words, we model the outputs as a linear
combination (i.e., weighted sum) of the inputs. This enables researchers to conduct experiments using
ecologically relevant stimuli that are more engaging and more representative of real-world scenarios. This
contrasts with current standard practices in which discrete stimuli are presented repeatedly in a highly
artificial manner. Moreover, the simplicity of linear models enables researchers to interpret the model
weights neurophysiologically, providing insight into the neural encoding process of naturalistic stimuli
(Haufe et al., 2014; Kriegeskorte and Douglas, 2019).
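The “weighted sum of inputs” idea can be made concrete with a small simulation. The sketch below uses Python/NumPy purely for illustration (the toolbox discussed later is MATLAB-based); all signals, the lag window, and the ridge parameter are invented values. It builds a design matrix of time-lagged stimulus samples, generates a response as a weighted sum of those lags plus noise, and recovers the weights with ridge regression, the estimator underlying temporal response function (TRF) analysis.

```python
import numpy as np

rng = np.random.default_rng(0)
fs = 64                   # sampling rate (Hz); illustrative value
n = fs * 60               # one minute of synthetic data
lags = np.arange(16)      # 0-250 ms of time lags at 64 Hz

# Synthetic univariate stimulus feature (stand-in for, e.g., a speech envelope)
stim = rng.standard_normal(n)

# Ground-truth TRF: a damped oscillation over the lag window
w_true = np.sin(2 * np.pi * lags / 16) * np.exp(-lags / 8)

# Design matrix of time-lagged copies of the stimulus
X = np.column_stack([np.roll(stim, L) for L in lags])
X[:lags.max(), :] = 0     # discard samples corrupted by wrap-around

# The linear model: response = weighted sum of lagged inputs, plus noise
resp = X @ w_true + 0.5 * rng.standard_normal(n)

# Ridge regression: w = (X'X + lambda*I)^-1 X'y
lam = 1.0
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(len(lags)), X.T @ resp)

print(round(np.corrcoef(w_true, w_hat)[0, 1], 2))
```

With a minute of clean simulated data the recovered weights correlate almost perfectly with the ground truth; with real EEG, noise and correlated regressors make regularization and cross-validation essential, as discussed in later sections.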
The uptake of linear modeling techniques in cognitive neuroscience has led to a similar adoption in the
applied and translational neurosciences. This has greatly facilitated the study of naturalistic sensory
processing in various clinical cohorts such as individuals with autism spectrum disorder (Frey et al., 2013)
and dyslexia (Power et al., 2013; Di Liberto et al., 2018b). However, studying clinical cohorts raises
important issues when constructing and interpreting linear models. For example, particular care is
required when performing group comparisons of the model weights and evaluating model performance.
Furthermore, linear modeling poses challenges and considerations that are not typical for other types of
electrophysiology analysis. As a model, it is meant first and foremost to quantify the functional
relationship between the stimulus features of interest and the recorded neural response. Modeling
electrophysiological data is non-trivial because neighboring time samples and channels are not
independent of each other, so standard methods for quantifying the significance of the fit cannot be used.
Furthermore, the interpretation of the results must carefully take into account the particular preprocessing steps applied, which can have major effects on the response patterns obtained with linear modeling, especially with respect to filtering, normalization and stimulus representation (Holdgraf et al.,
2017; de Cheveigné and Nelken, 2019). Here, we wish to provide guidance and intuition on such
procedures and, in particular, to promote best practices in applying these methods in clinical studies.
In this review, we will step through the stages involved in designing and implementing neuroscientific
experiments with linear modeling in mind. First, we discuss experimental design considerations for
optimizing model performance. Second, we discuss data preprocessing and stimulus feature extraction
techniques relevant to linear modeling. Third, we discuss model design choices and their use cases.
Fourth, we review how to appropriately train and test models as well as evaluate the significance of model
performance. Fifth, we discuss considerations for comparing models generated using multiple stimulus
representations. Sixth, we discuss the neurophysiological interpretation of linear model weights. Finally,
we discuss what can go wrong when using linear models for applied neurophysiology research.
In each section, via an example experiment, we will also introduce issues that are relevant to clinical
research. Because linear modeling is commonly used to study the neural processing of natural speech (for
reviews, see Ding and Simon, 2014; Holdgraf et al., 2017; Obleser and Kayser, 2019), these examples are
based on a speech study previously conducted by some of the authors, but the methods we describe
generalize to many other clinical groups, paradigms, and stimulus types. The researcher should modify
the experimental design, preprocessing and model design steps according to their own research
questions. Likewise, our focus will be on the linear modeling of EEG data, but these methods can be
applied to other neurophysiological data types, such as MEG, ECoG and fMRI. When discussing model
implementation, we will make specific reference to the mTRF-Toolbox, which can be found on github
(https://github.com/mickcrosse/mTRF-Toolbox). All functions referenced in this article are from version 3.0. While we do not elaborate on the technical details of the mTRF-Toolbox (for that we point the reader
to Crosse et al. (2016a)), we do provide example code and briefly walk the reader through its
implementation.
Example Experiment
The example experiment we will describe is based on a previous study performed by some of the co-authors of this review (Di Liberto et al., 2018b). Individuals with dyslexia (our clinical group) display a
specific behavioral deficit in the processing of speech sounds (i.e., a phonological deficit), while having
intact general acoustic processing (Vellutino et al., 2004; Di Liberto et al., 2018b). We hypothesize that
observed phonological deficits can be explained by weaker phonetic encoding.
To test our hypothesis, we plan to measure how well phonetic features are represented in the ongoing
brain activity of participants with dyslexia compared to a control group. More specifically, we will quantify
how much a model that represents phonetic features improves the ability to predict EEG data over a
model based on acoustic features alone (i.e., the spectrogram). We hypothesize that the predictive
contribution from the phonetic model is reduced in participants with dyslexia, reflective of impaired
neural tracking of phonetic features, while the contributions of acoustics are comparable between groups.
To be clear, while it is inspired by a real study, the example experiment we discuss in this paper is merely
a toy experiment for didactic purposes.
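The planned analysis, quantifying how much phonetic features improve EEG prediction over acoustics alone, amounts to comparing nested linear models on held-out data. The toy Python sketch below illustrates that logic with fully synthetic data; the feature counts, weights, and noise level are invented, and real analyses would use time-lagged features and cross-validation as described in later sections.

```python
import numpy as np

rng = np.random.default_rng(2)
n_train, n_test = 4000, 1000
n_acoustic, n_phonetic = 8, 4   # e.g., spectrogram bands and phonetic features

def fit_ridge(X, y, lam=1.0):
    # Ridge solution: w = (X'X + lam*I)^-1 X'y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Synthetic features; the "EEG" is driven by both feature sets plus noise
X = rng.standard_normal((n_train + n_test, n_acoustic + n_phonetic))
w = rng.uniform(0.5, 1.5, n_acoustic + n_phonetic)
eeg = X @ w + 2.0 * rng.standard_normal(n_train + n_test)

Xtr, Xte = X[:n_train], X[n_train:]
ytr, yte = eeg[:n_train], eeg[n_train:]

# Acoustic-only model vs. combined acoustic + phonetic model
w_a = fit_ridge(Xtr[:, :n_acoustic], ytr)
w_ap = fit_ridge(Xtr, ytr)

# Evaluate both on held-out data
r_a = np.corrcoef(Xte[:, :n_acoustic] @ w_a, yte)[0, 1]
r_ap = np.corrcoef(Xte @ w_ap, yte)[0, 1]

# The gain r_ap - r_a indexes the unique predictive contribution of phonetics
print(round(r_a, 3), round(r_ap, 3))
```

In the dyslexia example, the group comparison would be performed on this predictive gain: a smaller gain in the clinical group would be consistent with weaker phonetic encoding, provided the acoustic-only performance is comparable between groups.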
Experimental design
One of the benefits of employing linear models for EEG analysis is the ability to use dynamic and naturalistic stimuli. This allows the experimenter to study sensory processing in an ecologically relevant context and gives researchers the opportunity to design experiments that are more engaging for participants, potentially improving both the quality of the data collected and the reliability of the findings. Certain factors should be considered when designing naturalistic experiments.
Use subject-relevant stimulus material. This is primarily relevant to speech studies and is important for
ensuring subject compliance with the task, particularly when studying younger cohorts and individuals
with neurological disorders or developmental disabilities. For example, when choosing an audiobook or movie, it is important that it is 1) age-relevant (e.g., a children’s story versus an adult’s podcast), 2) content-
relevant (a quantum physics lecture may not be everyone’s cup of tea), and 3) language-relevant (speaker
dialect and even accent may impact early-stage processing across participants/groups differentially). It
may in some situations be necessary to create such content from scratch by recording a native speaker
reading the chosen material aloud. However, there are also publicly available stimulus databases such as
MUSAN: an annotated corpus of continuous speech, music and noise (Snyder et al., 2015), and TCD-TIMIT:
a phonetically rich corpus of continuous audiovisual speech (Harte and Gillen, 2015).
Use a well-balanced stimulus set. It is important to consider the frequency of occurrence of particular
stimulus features that are relevant to the study (e.g., spectral or phonetic features). For example, choosing
stimulus material that contains only a few instances of particular phonemes will make it difficult to reliably
model the neural response to such phonemes without overfitting to the noise on those examples. This
can be avoided by employing phonetically balanced stimuli, such as the aforementioned TCD-TIMIT corpus
(Harte and Gillen, 2015), or in a post hoc manner by focusing the analysis on a subset of the data, i.e., only
the features that are equally represented or only the time segments where the stimuli are well balanced.
It is also best to work with longer stimuli that are preferably broadband or quasi-periodic (e.g., speech or
music recordings). Linear modeling can produce ambiguous results if the stimulus is perfectly periodic: periodicity can produce artificially periodic-looking evoked responses and also makes it more difficult to quantify the accuracy of the model.
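The ambiguity introduced by periodic stimuli can be seen directly in the lagged design matrix: time-shifted copies of a pure sinusoid are linearly dependent, so the regression has no unique solution, whereas a broadband stimulus yields a full-rank design. A small Python check (illustrative sampling rate, frequency, and lag window only):

```python
import numpy as np

fs = 64                         # sampling rate (Hz)
t = np.arange(fs * 10) / fs     # 10 s of samples
lags = np.arange(16)            # 0-250 ms of time lags

def lagged_design(stim, lags):
    # Design matrix of time-lagged stimulus copies, wrap-around zeroed
    X = np.column_stack([np.roll(stim, L) for L in lags])
    X[:lags.max(), :] = 0
    return X

# Perfectly periodic stimulus: an 8 Hz sinusoid
periodic = np.sin(2 * np.pi * 8 * t)
# Broadband stimulus: white noise
rng = np.random.default_rng(1)
broadband = rng.standard_normal(len(t))

# All shifted sinusoids lie in the 2-D space spanned by one sine and one
# cosine, so the periodic design is massively rank-deficient
rank_periodic = np.linalg.matrix_rank(lagged_design(periodic, lags))
rank_broadband = np.linalg.matrix_rank(lagged_design(broadband, lags))
print(rank_periodic, rank_broadband)   # periodic rank is 2, broadband is 16
```

In the rank-deficient case, infinitely many weight vectors fit the data equally well, which is why the resulting model weights cannot be interpreted; broadband or quasi-periodic stimuli avoid this degeneracy.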
In addition, to enhance the model’s ability to disambiguate these response types and better generalize to
novel stimuli, one might consider how to incorporate additional acoustic variability in one’s stimuli,
independent of the linguistic content. This could be accomplished by including multiple speakers with
substantially different spectral profiles (e.g., both male and female speakers), as well as speakers who
provide a more dynamic range in prosody and intonation across the speech content (e.g., trained actors
or media presenters). Models that are trained on a broader range of stimuli are less likely to overfit to
stimulus features that are not of interest to the researcher (such as speaker identity, sex, or location), but
may perform slightly worse on average. Such decisions should be based on the researcher’s overall goals.
When considering your stimuli, we also suggest adopting an open mind with respect to possible future
analyses. Choosing materials that are rich in other features that can be modeled (e.g., semantic content,
prosody, temporal statistics) can provide fruitful opportunities for re-using your data to tackle new
questions beyond those planned in your current study (fans of Dr. Seuss and James Joyce beware!).
Collect enough training data. In order to train a model that generalizes well to new data, it is crucial to consider how much training data is required, or in other words, how much stimulus material is necessary. For most purposes, we recommend collecting a minimum of 10 to 20 minutes of data
per condition, although more data may be required for larger, multivariate models (e.g., spectrogram
models) or when features are sparsely represented (e.g., the onsets of content words). While it is feasible
to construct high-quality models from many short (<5 s) stimulus sequences, such as individual words or
sentences, it is preferable to use longer (>30 s) stimulus passages because doing so reduces the number of large
stimulus onset responses in the neural data, which tend to obscure feature-specific responses of interest
(see EEG preprocessing for tips on avoiding this).
While more data is always desirable for model training, longer recording sessions can cause subject
fatigue, compromising their ability to concentrate, particularly in children, older adults, or clinical cohorts.
Reduced attentional states can negatively impact the neural tracking of stimuli and as a result model