Linear Modeling of Neurophysiological Responses to
Naturalistic Stimuli: Methodological Considerations for
Applied Research
Michael J. Crosse 1,2,*, Nathaniel J. Zuk 3,*, Giovanni M. Di Liberto 4,5, Aaron R. Nidiffer 6, Sophie Molholm 2, and Edmund C. Lalor 6,†

1 X, The Moonshot Factory, Mountain View, CA
2 Department of Pediatrics and Department of Neuroscience, Albert Einstein College of Medicine, Bronx, NY
3 Edmond & Lily Safra Center for Brain Sciences, Hebrew University, Jerusalem, Israel
4 Trinity Centre for Biomedical Engineering, Trinity College Institute of Neuroscience, Dept of Mechanical, Manufacturing and Biomedical Engineering, Trinity College, The University of Dublin, Ireland
5 School of Electrical and Electronic Engineering and UCD Centre for Biomedical Engineering, University College Dublin, Ireland
6 Department of Biomedical Engineering and Department of Neuroscience, University of Rochester, Rochester, NY
Abstract
Cognitive neuroscience has seen an increase in the use of linear modelling techniques for studying the
processing of natural, environmental stimuli. The availability of such computational tools has prompted
similar investigations in many clinical domains, facilitating the study of cognitive and sensory deficits
within an ecologically relevant context. However, studying clinical (and often highly-heterogeneous)
cohorts introduces an added layer of complexity to such modelling procedures, leading to an increased
risk of improper usage of such techniques and, as a result, inconsistent conclusions. Here, we outline some
key methodological considerations for applied research and include worked examples of both simulated
and empirical electrophysiological (EEG) data. In particular, we focus on experimental design, data
preprocessing and stimulus feature extraction, model design, training and evaluation, and interpretation
of model weights. Throughout the paper, we demonstrate how to implement each stage in MATLAB using
the mTRF-Toolbox and discuss how to address issues that could arise in applied cognitive neuroscience
research. In doing so, we highlight the importance of understanding these more technical points for
experimental design and data analysis, and provide a resource for applied and clinical researchers
investigating sensory and cognitive processing using ecologically-rich stimuli.
Keywords: temporal response function, TRF, neural encoding, neural decoding, clinical and translational
neurophysiology, electrophysiology, EEG.
* These authors contributed equally to this work.
† e-mail: edmund_lalor@urmc.rochester.edu (E.C.L.)

Introduction
A core focus of cognitive neuroscience is to identify neural correlates of human behavior, with the
intention of understanding cognitive and sensory processing. Such correlates can be used to explicitly
model the functional relationship between some “real world” parameters describing a stimulus or
person’s behavior and the related brain activity. In particular, linear modelling techniques have become
ubiquitous within cognitive neuroscience because they provide a means of studying the processing of
dynamic sensory inputs such as natural scenes and sounds (Wu et al., 2006; Holdgraf et al., 2017). Unlike
event-related potentials (ERPs), which are a direct measurement of the average neural response to a discrete event, linear models seek to capture how changes in a stimulus dimension or cognitive state are
linearly reflected in the recorded brain activity. In other words, we model the outputs as a linear
combination (i.e., weighted sum) of the inputs. This enables researchers to conduct experiments using
ecologically relevant stimuli that are more engaging and more representative of real-world scenarios. This
contrasts with current standard practices in which discrete stimuli are presented repeatedly in a highly
artificial manner. Moreover, the simplicity of linear models enables researchers to interpret the model
weights neurophysiologically, providing insight into the neural encoding process of naturalistic stimuli
(Haufe et al., 2014; Kriegeskorte and Douglas, 2019).
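In its simplest, forward (encoding) form, this mapping can be written as a convolution of the stimulus feature with a set of unknown, channel-specific weights known as the temporal response function (TRF; following the notation of Crosse et al., 2016a):

r(t, n) = \sum_{\tau} w(\tau, n) \, s(t - \tau) + \varepsilon(t, n)

where r(t, n) is the neural response at channel n and time t, s(t) is the stimulus feature, w(\tau, n) are the model weights over a range of time lags \tau, and \varepsilon(t, n) is the residual response not explained by the model.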
The uptake of linear modelling techniques in cognitive neuroscience has led to a similar adoption in the
applied and translational neurosciences. This has greatly facilitated the study of naturalistic sensory
processing in various clinical cohorts such as individuals with autism spectrum disorder (Frey et al., 2013)
and dyslexia (Power et al., 2013; Di Liberto et al., 2018b). However, studying clinical cohorts raises
important issues when constructing and interpreting linear models. For example, particular care is
required when performing group comparisons of the model weights and evaluating model performance.
Furthermore, linear modeling poses challenges and considerations that are not typical for other types of
electrophysiology analysis. As a model, it is meant first and foremost to quantify the functional
relationship between the stimulus features of interest and the recorded neural response. Modeling
electrophysiological data is non-trivial because neighboring time samples and channels are not
independent of each other, so standard methods for quantifying the significance of the fit cannot be used.
Furthermore, the interpretation of the results must take into careful consideration the particular
preprocessing steps applied, which can have major effects on the response patterns obtained with linear
modeling, especially with respect to filtering, normalization and stimulus representation (Holdgraf et al., 2017; de Cheveigné and Nelken, 2019). Here, we wish to provide guidance and intuition on such
procedures and, in particular, to promote best practices in applying these methods in clinical studies.
In this review, we will step through the stages involved in designing and implementing neuroscientific
experiments with linear modeling in mind. First, we discuss experimental design considerations for
optimizing model performance. Second, we discuss data preprocessing and stimulus feature extraction
techniques relevant to linear modeling. Third, we discuss model design choices and their use cases.
Fourth, we review how to appropriately train and test models as well as evaluate the significance of model
performance. Fifth, we discuss considerations for comparing models generated using multiple stimulus
representations. Sixth, we discuss the neurophysiological interpretation of linear model weights. Finally,
we discuss what can go wrong when using linear models for applied neurophysiology research.
In each section, via an example experiment, we will also introduce issues that are relevant to clinical
research. Because linear modeling is commonly used to study the neural processing of natural speech (for
reviews, see Ding and Simon, 2014; Holdgraf et al., 2017; Obleser and Kayser, 2019), these examples are
based on a speech study previously conducted by some of the authors, but the methods we describe
generalize to many other clinical groups, paradigms, and stimulus types. The researcher should modify
the experimental design, preprocessing and model design steps according to their own research
questions. Likewise, our focus will be on the linear modeling of EEG data, but these methods can be
applied to other neurophysiological data types, such as MEG, ECoG and fMRI. When discussing model
implementation, we will make specific reference to the mTRF-Toolbox, which can be found on GitHub
(https://github.com/mickcrosse/mTRF-Toolbox). All functions referenced in this article were from version
3.0. While we do not elaborate on the technical details of the mTRF-Toolbox (for that we point the reader
to Crosse et al. (2016a)), we do provide example code and briefly walk the reader through its
implementation.
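To give a flavor of what that implementation looks like, the snippet below sketches the core calls for fitting and evaluating a forward model; the data file, variable names and parameter values are illustrative assumptions rather than recommendations, and the mTRFtrain/mTRFpredict signatures follow the toolbox documentation.

% Minimal sketch of a forward (encoding) model using mTRF-Toolbox.
% 'example_data.mat' is a hypothetical file containing stim (time x features),
% resp (time x channels) and the sampling rate fs.
load('example_data.mat', 'stim', 'resp', 'fs');

Dir    = 1;      % model direction: 1 = forward (encoding), -1 = backward (decoding)
tmin   = -100;   % minimum time lag (ms)
tmax   = 400;    % maximum time lag (ms)
lambda = 0.1;    % ridge regularization parameter (tune via cross-validation)

model = mTRFtrain(stim, resp, fs, Dir, tmin, tmax, lambda);   % estimate TRF weights
[pred, stats] = mTRFpredict(stim, resp, model);               % stats.r = prediction accuracy

% Note: predicting the same data used for training is circular and is shown
% here only to illustrate the calls; model evaluation should always be done
% on held-out data (see the sections on model training and evaluation).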
Example Experiment
The example experiment we will describe is based on a previous study performed by some of the co-authors of this review (Di Liberto et al., 2018b). Individuals with dyslexia (our clinical group) display a
specific behavioral deficit in the processing of speech sounds (i.e., a phonological deficit), while having
intact general acoustic processing (Vellutino et al., 2004; Di Liberto et al., 2018b). We hypothesize that
observed phonological deficits can be explained by weaker phonetic encoding.

To test our hypothesis, we plan to measure how well phonetic features are represented in the ongoing
brain activity of participants with dyslexia compared to a control group. More specifically, we will quantify
how much a model that represents phonetic features improves the ability to predict EEG data over a
model based on acoustic features alone (i.e., the spectrogram). We hypothesize that the predictive
contribution from the phonetic model is reduced in participants with dyslexia, reflective of impaired
neural tracking of phonetic features, while the contributions of acoustics are comparable between groups.
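As a preview of how this comparison could be implemented with the mTRF-Toolbox, the sketch below cross-validates an acoustic-only (S) model against a combined acoustic + phonetic (FS) model; the variable names, the trial-wise cell-array format and the layout of the stats output are assumptions to be checked against the toolbox documentation.

% spec, phon and eeg are hypothetical cell arrays with one entry per trial
% (time x features for the stimulus, time x channels for the EEG).
fs = 128; Dir = 1; tmin = -100; tmax = 400;
lambda = 10.^(-2:2:6);                        % range of ridge parameters to test

statsS  = mTRFcrossval(spec, eeg, fs, Dir, tmin, tmax, lambda);
stimFS  = cellfun(@(s,p) [s, p], spec, phon, 'UniformOutput', false);
statsFS = mTRFcrossval(stimFS, eeg, fs, Dir, tmin, tmax, lambda);

% Average prediction accuracy over trials and channels at each lambda
% (assuming stats.r is trials x lambdas x channels) and take the best lambda.
% In a real analysis, lambda should be selected on data held out from the test set.
rS  = max(mean(mean(statsS.r, 3), 1));
rFS = max(mean(mean(statsFS.r, 3), 1));
phoneticGain = rFS - rS;   % per-participant gain attributable to phonetic features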
To be clear, while it is inspired by a real study, the example experiment we discuss in this paper is merely
a toy experiment for didactic purposes.
Experimental design
One of the benefits of employing linear models for EEG analysis is the ability to use dynamic and
naturalistic stimuli. This allows the experimenter to study sensory processing in an ecologically relevant context and provides researchers the opportunity to design experiments that are more engaging for participants. This can potentially improve the quality of the data collected as well as the
reliability of the researcher’s findings. Certain factors should be considered when designing naturalistic
experiments.
Use subject-relevant stimulus material. This is primarily relevant to speech studies and is important for
ensuring subject compliance with the task, particularly when studying younger cohorts and individuals
with neurological disorders or developmental disabilities. For example, it is important, when choosing an audiobook or movie, that it is 1) age-relevant (e.g., a children's story versus an adult's podcast), 2) content-
relevant (a quantum physics lecture may not be everyone’s cup of tea), and 3) language-relevant (speaker
dialect and even accent may impact early-stage processing across participants/groups differentially). It
may in some situations be necessary to create such content from scratch by recording a native speaker
reading the chosen material aloud. However, there are also publicly available stimulus databases such as
MUSAN: an annotated corpus of continuous speech, music and noise (Snyder et al., 2015), and TCD-TIMIT:
a phonetically rich corpus of continuous audiovisual speech (Harte and Gillen, 2015).
Use a well-balanced stimulus set. It is important to consider the frequency of occurrence of particular
stimulus features that are relevant to the study (e.g., spectral or phonetic features). For example, choosing
stimulus material that contains only a few instances of particular phonemes will make it difficult to reliably
model the neural response to such phonemes without overfitting to the noise on those examples. This
can be avoided by employing phonetically balanced stimuli, such as the aforementioned TCD-TIMIT corpus

(Harte and Gillen, 2015), or in a post hoc manner by focusing the analysis on a subset of the data, i.e., only
the features that are equally represented or only the time segments where the stimuli are well balanced.
It is also best to work with longer stimuli that are preferably broadband or quasi-periodic (e.g., speech or
music recordings). Linear modeling can produce ambiguous results if the stimulus is perfectly periodic
since periodicity can result in artificially periodic-looking evoked responses, which also makes it more difficult to quantify the accuracy of the model.
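Returning to the question of balance, a simple post hoc check is to tabulate how often each feature of interest (e.g., each phoneme) occurs in the stimulus annotations before any modelling is done. The sketch below assumes the phoneme labels have already been read into a string array; the label values shown are toy examples.

% Count how often each phoneme label occurs in the stimulus set.
phonemes = ["aa" "b" "s" "aa" "t" "b" "aa"];    % toy labels; load from your corpus annotations
[labels, ~, idx] = unique(phonemes);
counts = accumarray(idx(:), 1);
[counts, order] = sort(counts);                 % rarest phonemes first
disp(table(labels(order)', counts, 'VariableNames', {'Phoneme', 'Count'}))

% Phonemes with very few instances may need to be merged with similar
% categories, excluded, or modelled with particular care to avoid overfitting.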
In addition, to enhance the model’s ability to disambiguate these response types and better generalize to
novel stimuli, one might consider how to incorporate additional acoustic variability in one’s stimuli,
independent of the linguistic content. This could be accomplished by including multiple speakers with
substantially different spectral profiles (e.g., both male and female speakers), as well as speakers who
provide a more dynamic range in prosody and intonation across the speech content (e.g., trained actors
or media presenters). Models that are trained on a broader range of stimuli are less likely to overfit to
stimulus features that are not of interest to the researcher (such as speaker identity, sex, or location), but
may perform slightly worse on average. Such decisions should be based on the researcher’s overall goals.
When considering your stimuli, we also suggest adopting an open mind with respect to possible future
analyses. Choosing materials that are rich in other features that can be modeled (e.g., semantic content,
prosody, temporal statistics) can provide fruitful opportunities for re-using your data to tackle new
questions beyond those planned in your current study (fans of Dr. Seuss and James Joyce beware!).
Collect enough training data. In order to train a model that generalizes well to new data, it is crucial to
consider how much training data is required, or in other words, how much new stimulus material it is
necessary to have. For most purposes, we recommend collecting a minimum of 10 to 20 minutes of data
per condition, although more data may be required for larger, multivariate models (e.g., spectrogram
models) or when features are sparsely represented (e.g., the onsets of content words). While it is feasible
to construct high-quality models from many short (<5 s) stimulus sequences, such as individual words or
sentences, it is preferable to use longer (>30 s) stimulus passages because it reduces the number of large
stimulus onset responses in the neural data, which tend to obscure feature-specific responses of interest
(see EEG preprocessing for tips on avoiding this).
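A quick back-of-envelope calculation can help gauge whether the planned amount of recording provides enough samples relative to the size of the model; the sampling rate, lag range and feature count below are illustrative assumptions, not recommendations.

% Rough check of training samples versus number of TRF parameters.
fs    = 128;                  % EEG sampling rate after downsampling (Hz)
tmin  = -100; tmax = 400;     % time-lag window (ms)
nFeat = 16;                   % e.g., a 16-band spectrogram
nLags = numel(floor(tmin/1e3*fs):ceil(tmax/1e3*fs));
nParams = nFeat*nLags + 1;    % TRF weights per EEG channel, plus a bias term

trainMins = 10;               % planned minutes of training data per condition
nSamples  = trainMins*60*fs;  % training samples available
fprintf('%d parameters per channel vs. %d samples (~%.0f samples per parameter)\n', ...
    nParams, nSamples, nSamples/nParams)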
While more data is always desirable for model training, longer recording sessions can fatigue participants, compromising their ability to concentrate, particularly in children, older adults, or clinical cohorts.
Reduced attentional states can negatively impact the neural tracking of stimuli and, as a result, model performance.

References

Hickok, G., & Poeppel, D. (2007). The cortical organization of speech processing. Nature Reviews Neuroscience.

Hyvärinen, A., & Oja, E. (2000). Independent component analysis: algorithms and applications. Neural Networks.

Maris, E., & Oostenveld, R. (2007). Nonparametric statistical testing of EEG- and MEG-data. Journal of Neuroscience Methods.

Nichols, T. E., & Holmes, A. P. (2002). Nonparametric permutation tests for functional neuroimaging: A primer with examples. Human Brain Mapping.

Vellutino, F. R., Fletcher, J. M., Snowling, M. J., & Scanlon, D. M. (2004). Specific reading disability (dyslexia): What have we learned in the past four decades? Journal of Child Psychology and Psychiatry.