Open Access Journal Article

Congruent Visual Speech Enhances Cortical Entrainment to Continuous Auditory Speech in Noise-Free Conditions

TL;DR
It is demonstrated that the brain's representation of auditory speech is enhanced when the accompanying visual speech signal shares the same timing, and this enhancement is most pronounced at a time scale that corresponds to mean syllable length.
Abstract
Congruent audiovisual speech enhances our ability to comprehend a speaker, even in noise-free conditions. When incongruent auditory and visual information is presented concurrently, it can hinder a listener's perception and even cause him or her to perceive information that was not presented in either modality. Efforts to investigate the neural basis of these effects have often focused on the special case of discrete audiovisual syllables that are spatially and temporally congruent, with less work done on the case of natural, continuous speech. Recent electrophysiological studies have demonstrated that cortical response measures to continuous auditory speech can be easily obtained using multivariate analysis methods. Here, we apply such methods to the case of audiovisual speech and, importantly, present a novel framework for indexing multisensory integration in the context of continuous speech. Specifically, we examine how the temporal and contextual congruency of ongoing audiovisual speech affects the cortical encoding of the speech envelope in humans using electroencephalography. We demonstrate that the cortical representation of the speech envelope is enhanced by the presentation of congruent audiovisual speech in noise-free conditions. Furthermore, we show that this is likely attributable to the contribution of neural generators that are not particularly active during unimodal stimulation and that it is most prominent at the temporal scale corresponding to syllabic rate (2–6 Hz). Finally, our data suggest that neural entrainment to the speech envelope is inhibited when the auditory and visual streams are incongruent both temporally and contextually.

SIGNIFICANCE STATEMENT Seeing a speaker's face as he or she talks can greatly help in understanding what the speaker is saying. This is because the speaker's facial movements relay information not only about what the speaker is saying but also, importantly, about when the speaker is saying it. Studying how the brain uses this timing relationship to combine information from continuous auditory and visual speech has traditionally been methodologically difficult. Here we introduce a new approach for doing this using relatively inexpensive and noninvasive scalp recordings. Specifically, we show that the brain's representation of auditory speech is enhanced when the accompanying visual speech signal shares the same timing. Furthermore, we show that this enhancement is most pronounced at a time scale that corresponds to mean syllable length.
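As an illustrative aside (not code from the paper), the sketch below shows one common way of deriving a broadband speech envelope and band-limiting it to the 2–6 Hz syllabic range referred to above, using Python/SciPy. The filename, sampling rates, filter order, and cutoffs are all assumptions.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import hilbert, butter, filtfilt, resample_poly

# Hypothetical mono speech recording.
fs, audio = wavfile.read("speech.wav")
audio = audio.astype(np.float64)

# Broadband amplitude envelope via the analytic signal.
envelope = np.abs(hilbert(audio))

# Downsample to a typical EEG analysis rate before filtering.
fs_eeg = 128
envelope = resample_poly(envelope, fs_eeg, fs)

# Band-pass to the syllabic-rate band (2-6 Hz) highlighted in the abstract.
b, a = butter(2, [2 / (fs_eeg / 2), 6 / (fs_eeg / 2)], btype="band")
envelope_syllabic = filtfilt(b, a, envelope)
```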

Citations
Journal Article

The Multivariate Temporal Response Function (mTRF) Toolbox: A MATLAB Toolbox for Relating Neural Signals to Continuous Stimuli

TL;DR: A new open-source toolbox is introduced for estimating temporal response functions, which describe a mapping between stimulus and response in either direction; the importance of regularizing the analysis is explained, along with how this regularization can be optimized for a particular dataset.
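As a rough illustration of the idea behind such a toolbox (this is not the mTRF Toolbox's actual API), a forward temporal response function can be estimated as a ridge-regularised regression from time-lagged copies of the stimulus onto each EEG channel. The lag range and regularisation value below are assumptions.

```python
import numpy as np

def lagged_design(stim, lags):
    """Design matrix of time-lagged copies of a 1-D stimulus feature."""
    n = len(stim)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = stim[:n - lag]
        else:
            X[:n + lag, j] = stim[-lag:]
    return X

def fit_trf(stim, eeg, lags, lam=1e2):
    """Ridge regression from lagged stimulus to EEG channels (forward/encoding model)."""
    X = lagged_design(stim, lags)
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ eeg)
    return w  # shape: (n_lags, n_channels)

# Hypothetical usage: EEG sampled at 128 Hz, lags spanning 0-400 ms.
# lags = np.arange(0, int(0.4 * 128))
# trf = fit_trf(envelope_syllabic, eeg_data, lags)
```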
Journal Article

Neural Entrainment to Speech Modulates Speech Intelligibility.

TL;DR: The findings imply that speech-brain entrainment is critical for auditory speech comprehension and suggest that transcranial stimulation with speech-envelope-shaped currents can be used to modulate speech comprehension in impaired listening conditions.
Journal Article

Noise-robust cortical tracking of attended speech in real-world acoustic scenes.

TL;DR: Results suggest that cortical activity tracks an attended speech signal in a way that is invariant to acoustic distortions encountered in real-life sound environments, pointing to the potential utility of stimulus reconstruction techniques in attention-controlled brain-computer interfaces.
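For context, "stimulus reconstruction" here refers to a backward (decoding) model that maps from the EEG back to the speech envelope; the reconstruction can then be correlated against the attended and ignored talkers' envelopes to decide which was attended. The sketch below is a minimal, assumption-laden version of that idea (data shapes, lags, and regularisation are made up, and edge wrap-around from np.roll is ignored for brevity).

```python
import numpy as np

def reconstruct_envelope(eeg, envelope, lags, lam=1e2):
    """Backward model: ridge-regress time-lagged EEG channels onto the envelope."""
    X = np.concatenate([np.roll(eeg, lag, axis=0) for lag in lags], axis=1)
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ envelope)
    return X @ w

# Hypothetical attention decoding: train on the attended envelope, then compare
# correlations of the reconstruction with the attended vs. ignored envelopes.
# recon = reconstruct_envelope(eeg_data, env_attended, lags=range(32))
# r_attended = np.corrcoef(recon, env_attended)[0, 1]
# r_ignored = np.corrcoef(recon, env_ignored)[0, 1]
```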
Journal Article

Decoding the auditory brain with canonical component analysis

TL;DR: An approach based on Canonical Correlation Analysis (CCA) is described that finds the optimal transforms to apply to both the stimulus and the response to reveal correlations between the two, providing increased sensitivity to relatively small effects and supporting classifier schemes that yield higher classification scores.
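A minimal sketch of that general idea, using scikit-learn's CCA on time-lagged copies of both the EEG and the stimulus envelope; the data shapes, lag count, and number of components are assumptions, not the paper's pipeline.

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def add_lags(x, n_lags):
    """Stack time-lagged copies of a (time, features) array along the feature axis."""
    if x.ndim == 1:
        x = x[:, None]
    return np.concatenate([np.roll(x, lag, axis=0) for lag in range(n_lags)], axis=1)

# Hypothetical data: eeg_data of shape (time, channels), envelope of shape (time,).
# X = add_lags(eeg_data, n_lags=16)   # lagged EEG
# Y = add_lags(envelope, n_lags=16)   # lagged stimulus envelope
# cca = CCA(n_components=4)
# x_c, y_c = cca.fit_transform(X, Y)
# component_corrs = [np.corrcoef(x_c[:, k], y_c[:, k])[0, 1] for k in range(4)]
```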
References
Journal Article

A tutorial on hidden Markov models and selected applications in speech recognition

TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
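To make the basic theory concrete, the forward algorithm computes the likelihood of an observation sequence under an HMM by dynamic programming. Below is a minimal numpy sketch; the transition, emission, and initial probabilities are made-up toy numbers, not values from the tutorial.

```python
import numpy as np

def forward_likelihood(pi, A, B, obs):
    """Forward algorithm: P(observation sequence | HMM).

    pi  : (n_states,)            initial state probabilities
    A   : (n_states, n_states)   transition probabilities, A[i, j] = P(j | i)
    B   : (n_states, n_symbols)  emission probabilities
    obs : list of observed symbol indices
    """
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

# Toy example: 2 hidden states, 3 observable symbols.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])
print(forward_likelihood(pi, A, B, obs=[0, 2, 1]))
```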
Journal Article

EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis.

TL;DR: EEGLAB, as described in this paper, is a toolbox and graphical user interface for processing collections of single-trial and/or averaged EEG data of any number of channels. Its functions include data, channel, and event-information importing; data visualization (scrolling, scalp-map and dipole-model plotting, plus multi-trial ERP-image plots); preprocessing (including artifact rejection, filtering, epoch selection, and averaging); independent component analysis (ICA); and time/frequency decomposition, including channel and component cross-coherence supported by bootstrap statistical methods based on data resampling.
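EEGLAB itself is a MATLAB toolbox, so the sketch below is only a loose analogue of the same pipeline steps (filtering, ICA-based artifact rejection, epoching, averaging) written with MNE-Python; the filename, filter settings, excluded components, and event codes are all assumptions.

```python
import mne

# Load a hypothetical raw recording and band-pass filter it.
raw = mne.io.read_raw_fif("subject01_raw.fif", preload=True)
raw.filter(l_freq=1.0, h_freq=40.0)

# ICA for artifact rejection (components to exclude are chosen by inspection).
ica = mne.preprocessing.ICA(n_components=20, random_state=0)
ica.fit(raw)
ica.exclude = [0]
ica.apply(raw)

# Epoch around hypothetical stimulus triggers and average to an evoked response.
events = mne.find_events(raw)
epochs = mne.Epochs(raw, events, event_id={"stim": 1}, tmin=-0.2, tmax=0.8,
                    baseline=(None, 0), preload=True)
evoked = epochs.average()
```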
Journal Article

Hearing lips and seeing voices

TL;DR: The study reported here demonstrates a previously unrecognised influence of vision upon speech perception: on being shown a film of a young woman's talking head in which repeated utterances of the syllable [ba] had been dubbed onto lip movements for [ga], listeners reported perceiving a syllable that was presented in neither modality.
Journal Article

Visual contribution to speech intelligibility in noise

TL;DR: In this article, the visual contribution to oral speech intelligibility was examined as a function of the speech-to-noise ratio and of the size of the vocabulary under test.
Book

The Merging of the Senses

TL;DR: The authors draw on their own experiments to illustrate how sensory inputs converge on individual neurons in different areas of the brain, how these neurons integrate their inputs, the principles by which this integration occurs, and what this may mean for perception and behavior.