
Showing papers on "Crossmodal published in 2006"


Journal ArticleDOI
TL;DR: It is contended that the multisensory speech system is maximally tuned for SNRs between extremes, where the system relies on either the visual (speech-reading) or the auditory modality alone, forming a window of maximal integration at intermediate SNR levels.
Abstract: Viewing a speaker’s articulatory movements substantially improves a listener’s ability to understand spoken words, especially under noisy environmental conditions. It has been claimed that this gain is most pronounced when auditory input is weakest, an effect that has been related to a well-known principle of multisensory integration—“inverse effectiveness.” In keeping with the predictions of this principle, the present study showed substantial gain in multisensory speech enhancement at even the lowest signal-to-noise ratios (SNRs) used (−24 dB), but it was also evident that there was a “special zone” at a more intermediate SNR of −12 dB where multisensory integration was additionally enhanced beyond the predictions of this principle. As such, we show that inverse effectiveness does not strictly apply to the multisensory enhancements seen during audiovisual speech perception. Rather, the gain from viewing visual articulations is maximal at intermediate SNRs, well above the lowest auditory SNR where the recognition of whole words is significantly different from zero. We contend that the multisensory speech system is maximally tuned for SNRs between the extremes at which the system relies on either the visual (speechreading) or the auditory modality alone, forming a window of maximal integration at intermediate SNR levels. At these intermediate levels, the extent of multisensory enhancement of speech recognition is considerable, amounting to more than a 3-fold performance improvement relative to an auditory-alone condition.
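As a rough illustration of the two ways the enhancement described above can be quantified, the following Python sketch computes absolute and proportional (relative) multisensory gain from hypothetical accuracy values. The SNR levels echo those mentioned in the abstract, but the proportion-correct numbers are made up for illustration and are not the authors' data.

```python
# Illustrative only: hypothetical proportion-correct values, not the study's data.
snr_db        = [-24, -18, -12, -6, 0]                  # auditory signal-to-noise ratios (dB)
auditory_only = [0.02, 0.10, 0.25, 0.60, 0.90]          # hypothetical auditory-alone accuracy
audiovisual   = [0.08, 0.35, 0.80, 0.85, 0.95]          # hypothetical audiovisual accuracy

for snr, a, av in zip(snr_db, auditory_only, audiovisual):
    absolute_gain = av - a          # raw benefit of adding the visual articulations
    relative_gain = (av - a) / a    # benefit scaled by unimodal (auditory-alone) performance
    # Inverse effectiveness concerns the scaled (relative) gain, which is largest where
    # unimodal performance is poorest; the "window of maximal integration" described
    # above concerns where the absolute benefit to word recognition peaks.
    print(f"SNR {snr:>4} dB: absolute gain = {absolute_gain:.2f}, "
          f"relative gain = {relative_gain:.1f}x")
```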

587 citations


Journal ArticleDOI
TL;DR: This review addresses a fundamental neuroscientific question in food perception, namely how multimodal features of food are integrated, by introducing several plausible neuroscientific models that provide a framework for further exploration in this area.

340 citations


Journal ArticleDOI
TL;DR: It is demonstrated that brief exposure to ecologically valid and sensory redundant stimulus pairs, such as voices and faces, induces specific multisensory associations, and it is suggested that for natural objects effective predictive signals can be generated across sensory systems and proceed by optimization of functional connectivity between specialized cortical sensory modules.
Abstract: Natural objects provide partially redundant information to the brain through different sensory modalities. For example, voices and faces both give information about the speech content, age, and gender of a person. Thanks to this redundancy, multimodal recognition is fast, robust, and automatic. In unimodal perception, however, only part of the information about an object is available. Here, we addressed whether, even under conditions of unimodal sensory input, crossmodal neural circuits that have been shaped by previous associative learning become activated and underpin a performance benefit. We measured brain activity with functional magnetic resonance imaging before, while, and after participants learned to associate either sensory redundant stimuli, i.e. voices and faces, or arbitrary multimodal combinations, i.e. voices and written names, ring tones, and cell phones or brand names of these cell phones. After learning, participants were better at recognizing unimodal auditory voices that had been paired with faces than those paired with written names, and association of voices with faces resulted in an increased functional coupling between voice and face areas. No such effects were observed for ring tones that had been paired with cell phones or names. These findings demonstrate that brief exposure to ecologically valid and sensory redundant stimulus pairs, such as voices and faces, induces specific multisensory associations. Consistent with predictive coding theories, associative representations become thereafter available for unimodal perception and facilitate object recognition. These data suggest that for natural objects effective predictive signals can be generated across sensory systems and proceed by optimization of functional connectivity between specialized cortical sensory modules.

311 citations


Journal ArticleDOI
TL;DR: These results integrate nonhuman and human primate research by providing converging evidence that human perirhinal cortex is also critically involved in processing meaningful aspects of multimodal object representations.
Abstract: Knowledge of objects in the world is stored in our brains as rich, multimodal representations. Because the neural pathways that process this diverse sensory information are largely anatomically distinct, a fundamental challenge to cognitive neuroscience is to explain how the brain binds the different sensory features that comprise an object to form meaningful, multimodal object representations. Studies with nonhuman primates suggest that a structure at the culmination of the object recognition system (the perirhinal cortex) performs this critical function. In contrast, human neuroimaging studies implicate the posterior superior temporal sulcus (pSTS). The results of the functional MRI study reported here resolve this apparent discrepancy by demonstrating that both pSTS and the perirhinal cortex contribute to crossmodal binding in humans, but in different ways. Significantly, only perirhinal cortex activity is modulated by meaning variables (e.g., semantic congruency and semantic category), suggesting that these two regions play complementary functional roles, with pSTS acting as a presemantic, heteromodal region for crossmodal perceptual features, and perirhinal cortex integrating these features into higher-level conceptual representations. This interpretation is supported by the results of our behavioral study: Patients with lesions, including the perirhinal cortex, but not patients with damage restricted to frontal cortex, were impaired on the same crossmodal integration task, and their performance was significantly influenced by the same semantic factors, mirroring the functional MRI findings. These results integrate nonhuman and human primate research by providing converging evidence that human perirhinal cortex is also critically involved in processing meaningful aspects of multimodal object representations.

252 citations


Journal ArticleDOI
TL;DR: The results show that multisensory interactions can be exploited to yield more efficient learning of sensory information and suggest that multisensory training programs would be most effective for the acquisition of new skills.

216 citations


Journal ArticleDOI
TL;DR: Only the two youngest groups exhibited intersensory matching, indicating that perceptual narrowing is pan-sensory and a fundamental feature of perceptual development.
Abstract: Between 6 and 10 months of age, infants become better at discriminating among native voices and human faces and worse at discriminating among nonnative voices and other species’ faces. We tested whether these unisensory perceptual narrowing effects reflect a general ontogenetic feature of perceptual systems by testing across sensory modalities. We showed pairs of monkey faces producing two different vocalizations to 4-, 6-, 8-, and 10-month-old infants and asked whether they would prefer to look at the corresponding face when they heard one of the two vocalizations. Only the two youngest groups exhibited intersensory matching, indicating that perceptual narrowing is pan-sensory and a fundamental feature of perceptual development.

157 citations


Journal ArticleDOI
TL;DR: Significantly stronger activations were found in the mid-portion of the right fusiform gyrus during judgment of facial expressions in the presence of fearful as compared to happy intonations, indicating that enhanced processing of faces within this region can be induced by the presence of threat-related information perceived via the auditory modality.
Abstract: Emotional information can be conveyed by various means of communication, such as propositional content, speech intonation, facial expression, and gestures. Prior studies have demonstrated that inputs from one modality can alter perception in another modality. To evaluate the impact of emotional intonation on ratings of emotional faces, a behavioral study first was carried out. Second, functional magnetic resonance imaging (fMRI) was used to identify brain regions that mediate crossmodal effects of emotional prosody on judgments of facial expressions. In the behavioral study, subjects rated fearful and neutral facial expressions as being more fearful when accompanied by a fearful voice as compared to the same facial expressions without concomitant auditory stimulus, whereas no such influence on rating of faces was found for happy voices. In the fMRI experiment, this shift in rating of facial expressions in the presence of a fearfully spoken sentence was correlated with the hemodynamic response in the left amygdala extending into the periamygdaloid cortex, which suggests that crossmodal effects on cognitive judgments of emotional information are mediated via these neuronal structures. Furthermore, significantly stronger activations were found in the mid-portion of the right fusiform gyrus during judgment of facial expressions in the presence of fearful as compared to happy intonations, indicating that enhanced processing of faces within this region can be induced by the presence of threat-related information perceived via the auditory modality. Presumably, these increased extrastriate activations correspond to enhanced alertness, whereas responses within the left amygdala modulate cognitive evaluation of emotional facial expressions.

136 citations


Journal Article
TL;DR: The results of a number of studies show that the modulation of the auditory cues elicited by our contact or interaction with different surfaces (such as abrasive sandpapers or even our own skin) and products can dramatically change the way in which they are perceived, despite the fact that we are often unaware of the influence of such auditory cues on our perception.
Abstract: The sounds that are elicited when we touch or use many everyday objects typically convey potentially useful information regarding the nature of the stimuli with which we are interacting. Here we review the rapidly-growing literature demonstrating the influence of auditory cues (such as overall sound level and the spectral distribution of the sounds) on multisensory product perception. The results of a number of studies now show that the modulation of the auditory cues elicited by our contact or interaction with different surfaces (such as abrasive sandpapers or even our own skin) and products (including electric toothbrushes, aerosol sprays, food mixers, and cars) can dramatically change the way in which they are perceived, despite the fact that we are often unaware of the influence of such auditory cues on our perception. The auditory cues generated by products can also be modified in order to change people's perception of the quality/efficiency of those products. The principles of sound design have also been used recently to alter people's perception of a variety of foodstuffs. Findings such as these demonstrate the automatic and obligatory nature of multisensory integration, and show how the cues available in one sensory modality can modulate people's perception of stimuli in other sensory modalities (despite the fact that they may not be aware of the importance of such crossmodal influences). We also highlight evidence showing that auditory cues can influence product perception at a more semantic level, as demonstrated by research on signature sounds and emotional product sound design.

133 citations


Book ChapterDOI
TL;DR: Consistent with behavioral and neuroimaging data showing that chromatic-graphemic (colored-letter) synesthesia is a genuine perceptual phenomenon implicating extrastriate cortex, electrophysiological data are presented showing modulation of visual evoked potentials by synesthetic color congruency.
Abstract: Synesthesia is a condition in which stimulation in one modality also gives rise to a perceptual experience in a second modality. In two recent studies we found that the condition is more common than previously reported; up to 5% of the population may experience at least one type of synesthesia. Although the condition has been traditionally viewed as an anomaly (e.g., breakdown in modularity), it seems that at least some of the mechanisms underlying synesthesia do reflect universal crossmodal mechanisms. We review here a number of examples of crossmodal correspondences found in both synesthetes and nonsynesthetes, including pitch-lightness and vision-touch interactions, as well as cross-domain spatial-numeric interactions. Additionally, we discuss the common role of spatial attention in binding shape and color surface features (whether ordinary or synesthetic color). Consistent with behavioral and neuroimaging data showing that chromatic–graphemic (colored-letter) synesthesia is a genuine perceptual phenomenon implicating extrastriate cortex, we also present electrophysiological data showing modulation of visual evoked potentials by synesthetic color congruency.

128 citations


Journal ArticleDOI
TL;DR: An examination of 7-month-old infants' processing of emotionally congruent and incongruent face-voice pairs using ERP measures suggests that 7-month-olds integrate emotional information across modalities and recognize common affect in the face and voice.
Abstract: We examined 7-month-old infants’ processing of emotionally congruent and incongruent face‐voice pairs using ERP measures. Infants watched facial expressions (happy or angry) and, after a delay of 400 ms, heard a word spoken with a prosody that was either emotionally congruent or incongruent with the face being presented. The ERP data revealed that the amplitude of a negative component and a subsequent positive component in infants’ ERPs varied as a function of crossmodal emotional congruity. An emotionally incongruent prosody elicited a larger negative component in infants’ ERPs than did an emotionally congruent prosody. Conversely, the amplitude of infants’ positive component was larger to emotionally congruent than to incongruent prosody. Previous work has shown that an attenuation of the negative component and an enhancement of the later positive component in infants’ ERPs reflect the recognition of an item. Thus, the current findings suggest that 7-month-olds integrate emotional information across modalities and recognize common affect in the face and voice.

126 citations


Journal ArticleDOI
TL;DR: IB for words was investigated within and across sensory modalities in order to assess whether dividing attention across different senses has the same consequences as dividing attention within an individual sensory modality; IB was found to be less prevalent when attention is divided across modalities than within the same modality.
Abstract: People often fail to consciously perceive visual events that are outside the focus of attention, a phenomenon referred to as inattentional blindness or IB (i.e., Mack & Rock, 1998). Here, we investigated IB for words within and across sensory modalities (visually and auditorily) in order to assess whether dividing attention across different senses has the same consequences as dividing attention within an individual sensory modality. Participants were asked to monitor a rapid stream of pictures or sounds presented concurrently with task-irrelevant words (spoken or written). A word recognition test was used to measure the processing for unattended words compared to word recognition levels after explicitly monitoring the word stream. We were able to produce high levels of IB for visually and auditorily presented words under unimodal conditions (Experiment 1) as well as under crossmodal conditions (Experiment 2). A further manipulation revealed, however, that IB is less prevalent when attention is divided across sensory modalities than when it is divided within the same sensory modality.

Journal ArticleDOI
TL;DR: The largest crossmodal congruency effects were obtained when the visual distractor preceded the vibrotactile target by 50-100 ms, although significant effects were also reported when the distractor followed the target by as much as 100 ms.

Journal ArticleDOI
TL;DR: This work shows that the sign of cross-modal interactions depends on whether the content of two modalities is associated or not, and illustrates an ecologically optimal flexibility of the neural mechanisms that govern multisensory processing.
Abstract: Previous studies have shown that processing information in one sensory modality can either be enhanced or attenuated by concurrent stimulation of another modality. Here, we reconcile these apparently contradictory results by showing that the sign of cross-modal interactions depends on whether the content of two modalities is associated or not. When concurrently presented auditory and visual stimuli are paired by chance, cue-induced preparatory neural activity is strongly enhanced in the task-relevant sensory system and suppressed in the irrelevant system. Conversely, when information in the two modalities is reliably associated, activity is enhanced in both systems regardless of which modality is task relevant. Our findings illustrate an ecologically optimal flexibility of the neural mechanisms that govern multisensory processing: facilitation occurs when integration is expected, and suppression occurs when distraction is expected. Because thalamic structures were more active when the senses needed to operate separately, we propose them to serve gatekeeper functions in early cross-modal interactions.


Journal ArticleDOI
TL;DR: Older participants, but not those in the two younger age groups, found it harder to attend selectively to targets in one modality when distractor stimuli came from the same side rather than from the opposite side, which suggests that ageing may also compromise spatial aspects of crossmodal selective attention.

Journal ArticleDOI
TL;DR: It is shown that sentence processing can be affected by the concurrent processing of auditory stimuli and that the processing of sentences depends on whether the sentences are presented in the auditory or visual modality.

Journal ArticleDOI
TL;DR: The model accounts for the finding that temporal discrimination depends on the presentation order of the sensory modalities, but fails to explain why temporal discrimination was much better with congruent than with incongruent trials.
Abstract: In this study, an extended pacemaker-counter model was applied to crossmodal temporal discrimination. In three experiments, subjects discriminated between the durations of a constant standard stimulus and a variable comparison stimulus. In congruent trials, both stimuli were presented in the same sensory modality (i.e., both visual or both auditory), whereas in incongruent trials, each stimulus was presented in a different modality. The model accounts for the finding that temporal discrimination depends on the presentation order of the sensory modalities. Nevertheless, the model fails to explain why temporal discrimination was much better with congruent than with incongruent trials. The discussion considers possibilities to accommodate the model to this and other shortcomings.
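Because the pacemaker-counter framework invoked above is easier to grasp from a worked example than from prose, here is a minimal Python sketch of one common instantiation, assuming a pacemaker whose rate differs between audition and vision and Gaussian counting noise. The rates, noise level, and durations are illustrative assumptions, not parameters taken from the paper.

```python
import random

# Illustrative parameters (assumptions, not values from the paper).
RATE = {"auditory": 12.0, "visual": 10.0}   # pulses per 100 ms; audition assumed faster
NOISE_SD = 2.0                              # SD of Gaussian counting noise, in pulses

def accumulated_count(duration_ms, modality):
    """Pulses accumulated by the counter while the stimulus is presented."""
    mean_count = RATE[modality] * duration_ms / 100.0
    return random.gauss(mean_count, NOISE_SD)

def judged_longer(standard_ms, comparison_ms, standard_mod, comparison_mod):
    """True if the comparison stimulus is judged longer than the standard."""
    return (accumulated_count(comparison_ms, comparison_mod)
            > accumulated_count(standard_ms, standard_mod))

# With unequal pacemaker rates, the same physical durations yield systematically
# different counts depending on which modality fills which role, so judgments
# shift with the pairing and order of the modalities.
trials = 10_000
for standard_mod, comparison_mod in [("visual", "visual"), ("auditory", "visual")]:
    p = sum(judged_longer(500, 550, standard_mod, comparison_mod)
            for _ in range(trials)) / trials
    print(f"standard={standard_mod:>8}, comparison={comparison_mod}: "
          f"P('comparison judged longer') = {p:.2f}")
```

In this toy version the rate difference produces a judgment bias that depends on which modality serves as the standard, which is the kind of order dependence the model captures; it does not, by itself, produce the poorer discrimination on incongruent trials noted above.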

Journal ArticleDOI
TL;DR: It is suggested that attention to visually perceived speech gestures modulates auditory cortex function and that this modulation takes place at a hierarchically relatively early processing level.
Abstract: Observing a speaker's articulatory gestures can contribute considerably to auditory speech perception. At the level of neural events, seen articulatory gestures can modify auditory cortex responses to speech sounds and modulate auditory cortex activity also in the absence of heard speech. However, possible effects of attention on this modulation have remained unclear. To investigate the effect of attention on visual speech-induced auditory cortex activity, we scanned 10 healthy volunteers with functional magnetic resonance imaging (fMRI) at 3 T during simultaneous presentation of visual speech gestures and moving geometrical forms, with the instruction to either focus on or ignore the seen articulations. Secondary auditory cortex areas in the bilateral posterior superior temporal gyrus and planum temporale were active both when the articulatory gestures were ignored and when they were attended to. However, attention to visual speech gestures enhanced activity in the left planum temporale compared to the situation when the subjects saw identical stimuli but engaged in a nonspeech motion discrimination task. These findings suggest that attention to visually perceived speech gestures modulates auditory cortex function and that this modulation takes place at a hierarchically relatively early processing level.

Journal ArticleDOI
TL;DR: The data suggest that crossmodal effects are differentially modulated according to the hierarchical core-belt organization of auditory cortex.

Journal ArticleDOI
TL;DR: The results suggest that representations of unfamiliar objects are primarily visual but that crossmodal memory for familiar objects may rely on a network of different representations.
Abstract: Two experiments used visual-, verbal-, and haptic-interference tasks during encoding (Experiment 1) and retrieval (Experiment 2) to examine mental representation of familiar and unfamiliar objects in visual/haptic crossmodal memory. Three competing theories are discussed, which variously suggest that these representations are: (a) visual; (b) dual-code—visual for unfamiliar objects but visual and verbal for familiar objects; or (c) amodal. The results suggest that representations of unfamiliar objects are primarily visual but that crossmodal memory for familiar objects may rely on a network of different representations. The pattern of verbal-interference effects suggests that verbal strategies facilitate encoding of unfamiliar objects regardless of modality, but only haptic recognition regardless of familiarity. The results raise further research questions about all three theoretical approaches.

Journal ArticleDOI
TL;DR: Overall, the results of the experiments reveal that older observers can effectively perceive 3-D shape from both vision and haptics.
Abstract: One hundred observers participated in two experiments designed to investigate aging and the perception of natural object shape. In the experiments, younger and older observers performed either a same/different shape discrimination task (experiment 1) or a cross-modal matching task (experiment 2). Quantitative effects of age were found in both experiments. The effect of age in experiment 1 was limited to cross-modal shape discrimination: there was no effect of age upon unimodal (i.e., within a single perceptual modality) shape discrimination. The effect of age in experiment 2 was eliminated either when the older observers were given an unlimited amount of time to perform the task or when the number of response alternatives was decreased. Overall, the results of the experiments reveal that older observers can effectively perceive 3-D shape from both vision and haptics.


Journal ArticleDOI
TL;DR: Task-related differences in BOLD response were observed in the right intraparietal sulcus and in the left superior temporal sulcus, providing a direct confirmation of the "what-where" functional segregation in the crossmodal audiovisual domain.

Journal ArticleDOI
TL;DR: It is confirmed that ageing deleteriously affects crossmodal temporal processing even when the spatial confound inherent in previous research has been ruled out.

Journal ArticleDOI
TL;DR: In this article, Rees et al. used fMRI to study the neural correlates of cross-modal, visual-tactile extinction in a single case (patient GK).

Journal ArticleDOI
TL;DR: The results showed a progressive improvement of visual detections during the training and an improvement of visual oculomotor exploration that allowed patients to efficiently compensate for the loss of vision; these findings are very promising with respect to the possibility of taking advantage of human multisensory capabilities to recover from unimodal sensory impairments.

Journal ArticleDOI
TL;DR: The current data do not support an additional influence of crossmodal integration on exogenous orienting, but are well in agreement with the existence of a supramodal spatial attention module that allocates attentional resources towards stimulated locations for different sensory modalities.
Abstract: The aim of this study was to establish whether spatial attention triggered by bimodal exogenous cues acts differently as compared to unimodal and crossmodal exogenous cues due to crossmodal integration. In order to investigate this issue, we examined cuing effects in discrimination tasks and compared these effects in a condition wherein a visual target was preceded by both visual and auditory exogenous cues delivered simultaneously at the same side (bimodal cue), with conditions wherein the visual target was preceded by either a visual (unimodal cue) or an auditory cue (crossmodal cue). The results of two experiments revealed that cuing effects on RTs in these three conditions with an SOA of 200 ms had comparable magnitudes. Differences at a longer SOA of 600 ms (inhibition of return for bimodal cues, Experiment 1) disappeared when catch trials were included (in Experiment 2). The current data do not support an additional influence of crossmodal integration on exogenous orienting, but are well in agreement with the existence of a supramodal spatial attention module that allocates attentional resources towards stimulated locations for different sensory modalities.
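For readers unfamiliar with how the "cuing effects on RTs" compared above are usually quantified, the sketch below computes the standard measure (mean uncued RT minus mean cued RT), where positive values indicate facilitation at the cued location and negative values indicate inhibition of return. The reaction times and condition labels are hypothetical, introduced only to show the arithmetic, and are not data from these experiments.

```python
# Hypothetical mean reaction times (ms), keyed by (cue type, SOA in ms).
mean_rt = {
    ("bimodal",    200): {"cued": 410, "uncued": 440},
    ("unimodal",   200): {"cued": 412, "uncued": 441},
    ("crossmodal", 200): {"cued": 415, "uncued": 443},
    ("bimodal",    600): {"cued": 455, "uncued": 445},   # slower at the cued location
}

for (cue_type, soa), rts in mean_rt.items():
    effect = rts["uncued"] - rts["cued"]   # cuing effect in ms
    label = "facilitation" if effect > 0 else "inhibition of return"
    print(f"{cue_type:<10} cue, SOA {soa} ms: cuing effect = {effect:+d} ms ({label})")
```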

Journal ArticleDOI
TL;DR: These results demonstrate, for the first time, significant spatial and postural modulations of crossmodal congruency effects in a non-spatial discrimination task.

Journal ArticleDOI
TL;DR: It is demonstrated in neurologically normal subjects that in addition to small increases in tactile sensitivity when a non-informative, suprathreshold visual stimulus is presented, there are highly consistent changes in response criteria for reporting touch with vision, even when no tactile stimulus is delivered.

Journal ArticleDOI
TL;DR: This study examined whether tactile change blindness might also be elicited by the presentation of a visual mask, and found that this crossmodal effect reflected a genuine perceptual impairment.