
Showing papers on "Visual perception published in 2015"


Posted Content
TL;DR: This work introduces an artificial system based on a Deep Neural Network that creates artistic images of high perceptual quality and offers a path forward to an algorithmic understanding of how humans create and perceive artistic imagery.
Abstract: In fine art, especially painting, humans have mastered the skill to create unique visual experiences through composing a complex interplay between the content and style of an image. Thus far the algorithmic basis of this process is unknown and there exists no artificial system with similar capabilities. However, in other key areas of visual perception such as object and face recognition near-human performance was recently demonstrated by a class of biologically inspired vision models called Deep Neural Networks. Here we introduce an artificial system based on a Deep Neural Network that creates artistic images of high perceptual quality. The system uses neural representations to separate and recombine content and style of arbitrary images, providing a neural algorithm for the creation of artistic images. Moreover, in light of the striking similarities between performance-optimised artificial neural networks and biological vision, our work offers a path forward to an algorithmic understanding of how humans create and perceive artistic imagery.
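
As a rough illustration of the separation described above, here is a minimal Python sketch assuming the widely used formulation of this approach: content is matched on a pretrained VGG-19 feature map and style on Gram matrices of feature maps. The layer indices, loss weights, and random placeholder images are illustrative assumptions, not the authors' settings.

import torch
import torch.nn.functional as F
from torchvision.models import vgg19

features = vgg19(weights="IMAGENET1K_V1").features.eval()
for p in features.parameters():
    p.requires_grad_(False)

def feats(img, layers):
    out, x = {}, img
    for i, layer in enumerate(features):
        x = layer(x)
        if i in layers:
            out[i] = x
    return out

def gram(f):                                 # style statistic: channel correlations
    b, c, h, w = f.shape
    f = f.view(c, h * w)
    return f @ f.t() / (c * h * w)

content_layers, style_layers = {21}, {0, 5, 10, 19, 28}     # assumed layer choices
content_img = torch.rand(1, 3, 256, 256)     # placeholders; real use loads photographs
style_img = torch.rand(1, 3, 256, 256)
with torch.no_grad():
    target_c = feats(content_img, content_layers)
    target_s = {i: gram(f) for i, f in feats(style_img, style_layers).items()}

x = content_img.clone().requires_grad_(True)
opt = torch.optim.Adam([x], lr=0.02)
for step in range(200):
    opt.zero_grad()
    f_x = feats(x, content_layers | style_layers)
    loss = sum(F.mse_loss(f_x[i], target_c[i]) for i in content_layers)
    loss = loss + 1e3 * sum(F.mse_loss(gram(f_x[i]), target_s[i]) for i in style_layers)
    loss.backward()
    opt.step()                               # x is optimised to keep content, adopt style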

1,019 citations


Journal ArticleDOI
TL;DR: The purpose of this article is to describe the fundamental stimulation paradigms for steady-state visual evoked potentials and to illustrate these principles through research findings across a range of applications in vision science.
Abstract: Periodic visual stimulation and analysis of the resulting steady-state visual evoked potentials were first introduced over 80 years ago as a means to study visual sensation and perception. From the first single-channel recording of responses to modulated light to the present use of sophisticated digital displays composed of complex visual stimuli and high-density recording arrays, steady-state methods have been applied in a broad range of scientific and applied settings. The purpose of this article is to describe the fundamental stimulation paradigms for steady-state visual evoked potentials and to illustrate these principles through research findings across a range of applications in vision science.

875 citations


Proceedings ArticleDOI
07 Dec 2015
TL;DR: In this paper, the authors investigated if the awareness of egomotion (i.e. self motion) can be used as a supervisory signal for feature learning and found that features learnt using self-motion as supervision compare favourably to features learned using class-label as supervision on the tasks of scene recognition, object recognition, visual odometry and keypoint matching.
Abstract: The current dominant paradigm for feature learning in computer vision relies on training neural networks for the task of object recognition using millions of hand labelled images. Is it also possible to learn features for a diverse set of visual tasks using any other form of supervision? In biology, living organisms developed the ability of visual perception for the purpose of moving and acting in the world. Drawing inspiration from this observation, in this work we investigated if the awareness of egomotion (i.e., self-motion) can be used as a supervisory signal for feature learning. As opposed to the knowledge of class labels, information about egomotion is freely available to mobile agents. We found that using the same number of training images, features learnt using egomotion as supervision compare favourably to features learnt using class-label as supervision on the tasks of scene recognition, object recognition, visual odometry and keypoint matching.
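
A minimal sketch of the training signal described above, under stated assumptions: a small siamese convolutional network sees two consecutive views and is trained to predict the agent's own motion between them (here discretised into bins), after which the shared base network is reused as a feature extractor. The architecture, the motion binning, and the fake data are illustrative, not the paper's implementation.

import torch
import torch.nn as nn

class BaseCNN(nn.Module):                    # the feature extractor we actually want
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())
    def forward(self, x):
        return self.net(x)

class EgomotionNet(nn.Module):               # siamese pair -> discretised egomotion
    def __init__(self, n_motion_bins=20):
        super().__init__()
        self.base = BaseCNN()
        self.head = nn.Linear(2 * 64 * 4 * 4, n_motion_bins)
    def forward(self, img_t, img_t1):
        f = torch.cat([self.base(img_t), self.base(img_t1)], dim=1)
        return self.head(f)

model = EgomotionNet()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# one illustrative step on fake data; real training pairs frames with odometry labels
img_t, img_t1 = torch.rand(8, 3, 64, 64), torch.rand(8, 3, 64, 64)
motion_bin = torch.randint(0, 20, (8,))      # e.g. binned turn angle from odometry
loss = loss_fn(model(img_t, img_t1), motion_bin)
opt.zero_grad(); loss.backward(); opt.step()
# after training, model.base supplies features for scene/object recognition tasks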

552 citations


Journal ArticleDOI
17 Jun 2015-Neuron
TL;DR: It is determined how learning modifies neural representations in primary visual cortex (V1) during acquisition of a visually guided behavioral task, revealing diverse mechanisms that modify sensory and non-sensory representations in V1 to adjust its processing to task requirements and the behavioral relevance of visual stimuli.

362 citations


Journal ArticleDOI
19 Aug 2015-Neuron
TL;DR: In this paper, the authors used fMRI-based reconstructions of remembered visual details from region-level activation patterns and found high-fidelity representations of a remembered orientation based on activation patterns in occipital visual cortex and in several sub-regions of frontal and parietal cortex.

317 citations


Journal ArticleDOI
TL;DR: Assessing the relationship between two-flash fusion thresholds (a measure of the temporal resolution of visual perception) and the frequency of eyes-closed and task-related alpha rhythms revealed that faster alpha frequencies predicted more accurate flash discrimination, providing novel evidence linking alpha frequency to the temporal resolution of perception.

304 citations


Journal ArticleDOI
TL;DR: Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex and unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition.
Abstract: To form a veridical percept of the environment, the brain needs to integrate sensory signals from a common source but segregate those from independent sources. Thus, perception inherently relies on solving the "causal inference problem." Behaviorally, humans solve this problem optimally as predicted by Bayesian Causal Inference; yet, the underlying neural mechanisms are unexplored. Combining psychophysics, Bayesian modeling, functional magnetic resonance imaging (fMRI), and multivariate decoding in an audiovisual spatial localization task, we demonstrate that Bayesian Causal Inference is performed by a hierarchy of multisensory processes in the human brain. At the bottom of the hierarchy, in auditory and visual areas, location is represented on the basis that the two signals are generated by independent sources (= segregation). At the next stage, in posterior intraparietal sulcus, location is estimated under the assumption that the two signals are from a common source (= forced fusion). Only at the top of the hierarchy, in anterior intraparietal sulcus, the uncertainty about the causal structure of the world is taken into account and sensory signals are combined as predicted by Bayesian Causal Inference. Characterizing the computational operations of signal interactions reveals the hierarchical nature of multisensory perception in human neocortex. It unravels how the brain accomplishes Bayesian Causal Inference, a statistical computation fundamental for perception and cognition. Our results demonstrate how the brain combines information in the face of uncertainty about the underlying causal structure of the world.
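
The three levels described above (segregation, forced fusion, and full Bayesian Causal Inference) can be written down compactly. The grid-based Python sketch below follows a common Gaussian formulation of Bayesian Causal Inference; the noise levels, spatial prior, and prior probability of a common cause are illustrative defaults, not the parameters fitted in this study.

import numpy as np

def bci_estimate(x_a, x_v, sig_a=6.0, sig_v=2.0, sig_p=20.0, p_common=0.5):
    s = np.linspace(-60, 60, 2001)                       # candidate source locations (deg)
    ds = s[1] - s[0]
    norm = lambda x, m, sd: np.exp(-0.5 * ((x - m) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
    prior = norm(s, 0.0, sig_p)
    like_a, like_v = norm(x_a, s, sig_a), norm(x_v, s, sig_v)

    # forced fusion (C = 1): one source generated both the auditory and the visual signal
    joint_c1 = like_a * like_v * prior
    p_x_c1 = joint_c1.sum() * ds
    s_fused = (s * joint_c1).sum() * ds / p_x_c1

    # segregation (C = 2): each signal is attributed to its own independent source
    p_xa = (like_a * prior).sum() * ds
    p_xv = (like_v * prior).sum() * ds
    s_vis_alone = (s * like_v * prior).sum() * ds / p_xv

    # causal inference: weight both estimates by the posterior over the causal structure
    p_c1 = p_x_c1 * p_common / (p_x_c1 * p_common + p_xa * p_xv * (1 - p_common))
    return p_c1 * s_fused + (1 - p_c1) * s_vis_alone     # model-averaged visual estimate

print(bci_estimate(x_a=10.0, x_v=2.0))   # small conflict -> estimate pulled toward fusion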

264 citations


Journal ArticleDOI
TL;DR: A formal meta-analytic approach is applied and a difference in the temporal pattern of the local-global balance is revealed, that is, slow global processing in individuals with ASD.
Abstract: What does an individual with autism spectrum disorder (ASD) perceive first: the forest or the trees? In spite of 30 years of research and influential theories like the weak central coherence (WCC) theory and the enhanced perceptual functioning (EPF) account, the interplay of local and global visual processing in ASD remains only partly understood. Research findings vary in indicating a local processing bias or a global processing deficit, and often contradict each other. We have applied a formal meta-analytic approach and combined 56 articles that tested about 1,000 ASD participants and used a wide range of stimuli and tasks to investigate local and global visual processing in ASD. Overall, results show neither enhanced local visual processing nor a deficit in global visual processing. Detailed analysis reveals a difference in the temporal pattern of the local–global balance, that is, slow global processing in individuals with ASD. Whereas task-dependent interaction effects are obtained, gender, age, and IQ of either participant group seem to have no direct influence on performance. Based on the overview of the literature, suggestions are made for future research.

242 citations


Posted Content
TL;DR: It is found that using the same number of training images, features learnt using egomotion as supervision compare favourably to features learnt with class-label as supervision on the tasks of scene recognition, object recognition, visual odometry and keypoint matching.
Abstract: The dominant paradigm for feature learning in computer vision relies on training neural networks for the task of object recognition using millions of hand labelled images. Is it possible to learn useful features for a diverse set of visual tasks using any other form of supervision? In biology, living organisms developed the ability of visual perception for the purpose of moving and acting in the world. Drawing inspiration from this observation, in this work we investigate if the awareness of egomotion can be used as a supervisory signal for feature learning. As opposed to the knowledge of class labels, information about egomotion is freely available to mobile agents. We show that given the same number of training images, features learnt using egomotion as supervision compare favourably to features learnt using class-label as supervision on visual tasks of scene recognition, object recognition, visual odometry and keypoint matching.

233 citations


Journal ArticleDOI
TL;DR: Studies of typical and atypical visual attention development are reviewed, and how they offer insights into the mechanisms of adult visual attention is explained.
Abstract: Visual attention functions as a filter to select environmental information for learning and memory, making it the first step in the eventual cascade of thought and action systems. Here, we review studies of typical and atypical visual attention development and explain how they offer insights into the mechanisms of adult visual attention. We detail interactions between visual processing and visual attention, as well as the contribution of visual attention to memory. Finally, we discuss genetic mechanisms underlying attention disorders and how attention may be modified by training.

227 citations


Journal ArticleDOI
TL;DR: These findings provide direct evidence that forming predictions about when a stimulus will appear can bias the phase of ongoing alpha-band oscillations toward an optimal phase for visual processing, and may thus serve as a mechanism for the top-down control of visual processing guided by temporal predictions.
Abstract: The physiological state of the brain before an incoming stimulus has substantial consequences for subsequent behavior and neural processing. For example, the phase of ongoing posterior alpha-band oscillations (8–14 Hz) immediately before visual stimulation has been shown to predict perceptual outcomes and downstream neural activity. Although this phenomenon suggests that these oscillations may phasically route information through functional networks, many accounts treat these periodic effects as a consequence of ongoing activity that is independent of behavioral strategy. Here, we investigated whether alpha-band phase can be guided by top-down control in a temporal cueing task. When participants were provided with cues predictive of the moment of visual target onset, discrimination accuracy improved and targets were more frequently reported as consciously seen, relative to unpredictive cues. This effect was accompanied by a significant shift in the phase of alpha-band oscillations, before target onset, toward each participant’s optimal phase for stimulus discrimination. These findings provide direct evidence that forming predictions about when a stimulus will appear can bias the phase of ongoing alpha-band oscillations toward an optimal phase for visual processing, and may thus serve as a mechanism for the top-down control of visual processing guided by temporal predictions.

Journal ArticleDOI
TL;DR: The key perceptual principles, namely, retinal photoreception, sensory channels, opponent processing, color constancy, and receptor noise, are discussed, to inform an analytical framework driven by the research question in relation to identifiable viewers and visual tasks of interest.
Abstract: The world in color presents a dazzling dimension of phenotypic variation. Biological interest in this variation has burgeoned, due to both increased means for quantifying spectral information and heightened appreciation for how animals view the world differently than humans. Effective study of color traits is challenged by how to best quantify visual perception in nonhuman species. This requires consideration of at least visual physiology but ultimately also the neural processes underlying perception. Our knowledge of color perception is founded largely on the principles gained from human psychophysics that have proven generalizable based on comparative studies in select animal models. Appreciation of these principles, their empirical foundation, and the reasonable limits to their applicability is crucial to reaching informed conclusions in color research. In this article, we seek a common intellectual basis for the study of color in nature. We first discuss the key perceptual principles, namely, retinal photoreception, sensory channels, opponent processing, color constancy, and receptor noise. We then draw on this basis to inform an analytical framework driven by the research question in relation to identifiable viewers and visual tasks of interest. Consideration of the limits to perceptual inference guides two primary decisions: first, whether a sensory-based approach is necessary and justified and, second, whether the visual task refers to perceptual distance or discriminability. We outline informed approaches in each situation and discuss key challenges for future progress, focusing particularly on how animals perceive color. Given that animal behavior serves as both the basic unit of psychophysics and the ultimate driver of color ecology/evolution, behavioral data are critical to reconciling knowledge across the schools of color research.
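
To make the "receptor noise" principle concrete, here is a small Python sketch of the receptor-noise-limited chromatic distance commonly used in this literature (trichromatic case). The quantum catches and noise values are made-up inputs for illustration; real analyses would derive them from measured spectra and the viewer's photoreceptors.

import numpy as np

def jnd_trichromat(q1, q2, e):
    # q1, q2: quantum catches of the three receptor classes for the two stimuli
    # e: noise (Weber fraction) of each receptor channel
    df = np.log(np.asarray(q1, float)) - np.log(np.asarray(q2, float))   # receptor contrasts
    num = (e[0] ** 2 * (df[1] - df[2]) ** 2
           + e[1] ** 2 * (df[0] - df[2]) ** 2
           + e[2] ** 2 * (df[0] - df[1]) ** 2)
    den = (e[0] * e[1]) ** 2 + (e[0] * e[2]) ** 2 + (e[1] * e[2]) ** 2
    return np.sqrt(num / den)                 # distance in just-noticeable differences

# e.g. two flower colours seen by a hypothetical trichromatic viewer
print(jnd_trichromat(q1=[0.20, 0.55, 0.90], q2=[0.22, 0.50, 0.95], e=[0.10, 0.07, 0.05]))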

Journal ArticleDOI
TL;DR: Progress is described using the computational neuroimaging approach in human visual cortex, which aims to build models that predict the neural responses from the stimulus and task.

Journal ArticleDOI
TL;DR: Challenges facing developers of cortical visual prostheses are detailed, in addition to a brief outline of the epidemiology of blindness and the history of cortical electrical stimulation in the context of visual prosthetics.

Journal ArticleDOI
TL;DR: The proposed dual content model of color representation demonstrates how the main consequence of visual working memory maintenance is the amplification of category related biases and stimulus-specific variability that originate in perception.
Abstract: Categorization with basic color terms is an intuitive and universal aspect of color perception. Yet research on visual working memory capacity has largely assumed that only continuous estimates within color space are relevant to memory. As a result, the influence of color categories on working memory remains unknown. We propose a dual content model of color representation in which color matches to objects that are either present (perception) or absent (memory) integrate category representations along with estimates of specific values on a continuous scale (“particulars”). We develop and test the model through 4 experiments. In a first experiment pair, participants reproduce a color target, both with and without a delay, using a recently influential estimation paradigm. In a second experiment pair, we use standard methods in color perception to identify boundary and focal colors in the stimulus set. The main results are that responses drawn from working memory are significantly biased away from category boundaries and toward category centers. Importantly, the same pattern of results is present without a memory delay. The proposed dual content model parsimoniously explains these results, and it should replace prevailing single content models in studies of visual working memory. More broadly, the model and the results demonstrate how the main consequence of visual working memory maintenance is the amplification of category related biases and stimulus-specific variability that originate in perception.

Journal ArticleDOI
04 Feb 2015-Neuron
TL;DR: Using multivoxel pattern analysis in visual cortex, it is found that the encoding of reward-associated targets was enhanced, whereas encoding of reward-associated distractors was suppressed, with the strength of this effect predicted by activity in the dopaminergic midbrain and a connected cortical network.

Journal ArticleDOI
TL;DR: This work finds that activity patterns in early visual areas (V1-V4) are strongly biased in favor of the attended object, and shows how feedback of an average template to V1-like units can improve discrimination of exemplars belonging to the attended category.
Abstract: What neural mechanisms underlie the ability to attend to a complex object in the presence of competing overlapping stimuli? We evaluated whether object-based attention might involve pattern-specific feedback to early visual areas to selectively enhance the set of low-level features corresponding to the attended object. Using fMRI and multivariate pattern analysis, we found that activity patterns in early visual areas (V1–V4) are strongly biased in favor of the attended object. Activity patterns evoked by single faces and single houses reliably predicted which of the 2 overlapping stimulus types was being attended with high accuracy (80–90% correct). Superior knowledge of upright objects led to improved attentional selection in early areas. Across individual blocks, the strength of the attentional bias signal in early visual areas was highly predictive of the modulations found in high-level object areas, implying that pattern-specific attentional filtering at early sites can determine the quality of object-specific signals that reach higher level visual areas. Through computational modeling, we show how feedback of an average template to V1-like units can improve discrimination of exemplars belonging to the attended category. Our findings provide a mechanistic account of how feedback to early visual areas can contribute to the attentional selection of complex objects.
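
The decoding logic described above can be sketched in a few lines: a cross-validated linear classifier is trained to predict the attended category from voxel activity patterns, and above-chance accuracy indicates an attentional bias in those patterns. The data below are simulated stand-ins, not the study's fMRI data.

import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_blocks, n_voxels = 40, 200
labels = np.repeat([0, 1], n_blocks // 2)            # 0 = attend face, 1 = attend house
attn_pattern = rng.normal(0, 1, n_voxels)            # attention-dependent voxel pattern
patterns = rng.normal(0, 1, (n_blocks, n_voxels)) + np.outer(labels - 0.5, attn_pattern)

clf = LinearSVC(max_iter=10000)
acc = cross_val_score(clf, patterns, labels, cv=5)   # cross-validated decoding accuracy
print("mean decoding accuracy:", acc.mean())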

Journal ArticleDOI
TL;DR: The current paper reviews some current uses for VR environments in psychological research and discusses some ongoing questions for researchers, focusing on the area of visual perception, where both the advantages and challenges of VR are particularly salient.
Abstract: Recent proliferation of available virtual reality (VR) tools has seen increased use in psychological research. This is due to a number of advantages afforded over traditional experimental apparatus such as tighter control of the environment and the possibility of creating more ecologically valid stimulus presentation and response protocols. At the same time, higher levels of immersion and visual fidelity afforded by VR do not necessarily evoke presence or elicit a “realistic” psychological response. The current paper reviews some current uses for VR environments in psychological research and discusses some ongoing questions for researchers. Finally, we focus on the area of visual perception, where both the advantages and challenges of VR are particularly salient.

Book
01 Jun 2015
TL;DR: It is argued that rapid object categorizations in natural scenes can be done without focused attention and are most likely based on coarse and unconscious visual representations activated with the first available (magnocellular) visual information.
Abstract: Visual categorization appears both effortless and virtually instantaneous. The study by Thorpe et al. (1996) was the first to estimate the processing time necessary to perform fast visual categorization of animals in briefly flashed (20 ms) natural photographs. They observed a large differential EEG activity between target and distractor correct trials that developed from 150 ms after stimulus onset, a value that was later shown to be even shorter in monkeys. With such strong processing time constraints, it was difficult to escape the conclusion that rapid visual categorization relies on massively parallel, essentially feed-forward processing of visual information. Since 1996, we have conducted a large number of studies to determine the characteristics and limits of fast visual categorization. The present chapter reviews some of the main results obtained. I will argue that rapid object categorizations in natural scenes can be done without focused attention and are most likely based on coarse and unconscious visual representations activated by the first available (magnocellular) visual information. Fast visual processing proved efficient for the categorization of large superordinate object or scene categories, but shows its limits when more detailed basic representations are required. Representations of basic objects (dogs, cars) or scenes (mountain or sea landscapes) need additional processing time to be activated, a finding that is at odds with the widely accepted idea that such basic representations are at the entry level of the system. Interestingly, focused attention is still not required to perform such more time-consuming basic categorizations. Finally, we will show that object and context processing can interact very early in an ascending wave of visual information processing. We will discuss how such data could result from our experience with a highly structured and predictable surrounding world that shaped neuronal visual selectivity.

Journal ArticleDOI
TL;DR: It is demonstrated that ECoG responses in human visual cortex (V1/V2/V3) can include robust narrowband gamma oscillations, and that these oscillations are reliably elicited by some spatial contrast patterns (luminance gratings) but not by others (noise patterns and many natural images).
Abstract: A striking feature of some field potential recordings in visual cortex is a rhythmic oscillation within the gamma band (30–80 Hz). These oscillations have been proposed to underlie computations in perception, attention, and information transmission. Recent studies of cortical field potentials, including human electrocorticography (ECoG), have emphasized another signal within the gamma band, a nonoscillatory, broadband signal, spanning 80–200 Hz. It remains unclear under what conditions gamma oscillations are elicited in visual cortex, whether they are necessary and ubiquitous in visual encoding, and what relationship they have to nonoscillatory, broadband field potentials. We demonstrate that ECoG responses in human visual cortex (V1/V2/V3) can include robust narrowband gamma oscillations, and that these oscillations are reliably elicited by some spatial contrast patterns (luminance gratings) but not by others (noise patterns and many natural images). The gamma oscillations can be conspicuous and robust, but because they are absent for many stimuli, which observers can see and recognize, the oscillations are not necessary for seeing. In contrast, all visual stimuli induced broadband spectral changes in ECoG responses. Asynchronous neural signals in visual cortex, reflected in the broadband ECoG response, can support transmission of information for perception and recognition in the absence of pronounced gamma oscillations.

Journal ArticleDOI
TL;DR: It is found that visual neurons in Drosophila receive motor-related inputs during rapid flight turns, which echo the suppression of visual perception during rapid eye movements in primates, demonstrating common functional principles of sensorimotor processing across phyla.
Abstract: Each time a locomoting fly turns, the visual image sweeps over the retina and generates a motion stimulus. Classic behavioral experiments suggested that flies use active neural-circuit mechanisms to suppress the perception of self-generated visual motion during intended turns. Direct electrophysiological evidence, however, has been lacking. We found that visual neurons in Drosophila receive motor-related inputs during rapid flight turns. These inputs arrived with a sign and latency appropriate for suppressing each targeted cell's visual response to the turn. Precise measurements of behavioral and neuronal response latencies supported the idea that motor-related inputs to optic flow-processing cells represent internal predictions of the expected visual drive induced by voluntary turns. Motor-related inputs to small object-selective visual neurons could reflect either proprioceptive feedback from the turn or internally generated signals. Our results in Drosophila echo the suppression of visual perception during rapid eye movements in primates, demonstrating common functional principles of sensorimotor processing across phyla.

12 Nov 2015
TL;DR: In this article, a Deep Q Network (DQNets) was used to learn target reaching with a three-joint robot manipulator using external visual observation, which was demonstrated to perform target reaching after training in simulation.
Abstract: This paper introduces a machine learning based system for controlling a robotic manipulator with visual perception only. The capability to autonomously learn robot controllers solely from raw-pixel images and without any prior knowledge of configuration is shown for the first time. We build upon the success of recent deep reinforcement learning and develop a system for learning target reaching with a three-joint robot manipulator using external visual observation. A Deep Q Network (DQN) was demonstrated to perform target reaching after training in simulation. Transferring the network to real hardware and real observation in a naive approach failed, but experiments show that the network works when replacing camera images with synthetic images.
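
A minimal sketch of the kind of system described above, under stated assumptions: a Deep Q Network maps a raw camera image to Q-values over a small set of discrete joint actions, actions are chosen epsilon-greedily, and weights are updated toward a one-step temporal-difference target. Network sizes, the action discretisation, and the missing environment/replay-buffer plumbing are illustrative simplifications, not the paper's exact setup.

import random
import torch
import torch.nn as nn

N_ACTIONS = 7            # e.g. +/- step per joint of a 3-joint arm, plus "hold" (assumed)

class DQN(nn.Module):
    def __init__(self, n_actions=N_ACTIONS):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(), nn.Flatten())
        self.fc = nn.Sequential(nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
                                nn.Linear(256, n_actions))
    def forward(self, x):                    # x: (batch, 1, 84, 84) grayscale camera frame
        return self.fc(self.conv(x))

q_net, gamma, eps = DQN(), 0.99, 0.1
opt = torch.optim.RMSprop(q_net.parameters(), lr=2.5e-4)

def select_action(frame):                    # epsilon-greedy over predicted Q-values
    if random.random() < eps:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(frame.unsqueeze(0)).argmax())

def td_update(frame, action, reward, next_frame, done):
    q = q_net(frame.unsqueeze(0))[0, action]
    with torch.no_grad():                    # a separate target network is normally used
        target = reward + (0.0 if done else gamma * q_net(next_frame.unsqueeze(0)).max())
    loss = (q - target) ** 2
    opt.zero_grad(); loss.backward(); opt.step()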

Journal ArticleDOI
TL;DR: This fast periodic visual stimulation approach provides a direct signature of natural face categorization and opens an avenue for efficiently measuring categorization responses of complex visual stimuli in the human brain.
Abstract: We designed a fast periodic visual stimulation approach to identify an objective signature of face categorization incorporating both visual discrimination (from nonface objects) and generalization (across widely variable face exemplars). Scalp electroencephalographic (EEG) data were recorded in 12 human observers viewing natural images of objects at a rapid frequency of 5.88 images/s for 60 s. Natural images of faces were interleaved every five stimuli, i.e., at 1.18 Hz (5.88/5). Face categorization was indexed by a high signal-to-noise ratio response, specifically at an oddball face stimulation frequency of 1.18 Hz and its harmonics. This face-selective periodic EEG response was highly significant for every participant, even for a single 60-s sequence, and was generally localized over the right occipitotemporal cortex. The periodicity constraint and the large selection of stimuli ensured that this selective response to natural face images was free of low-level visual confounds, as confirmed by the absence of any oddball response for phase-scrambled stimuli. Without any subtraction procedure, time-domain analysis revealed a sequence of differential face-selective EEG components between 120 and 400 ms after oddball face image onset, progressing from medial occipital (P1-faces) to occipitotemporal (N1-faces) and anterior temporal (P2-faces) regions. Overall, this fast periodic visual stimulation approach provides a direct signature of natural face categorization and opens an avenue for efficiently measuring categorization responses of complex visual stimuli in the human brain
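
The frequency-tagging analysis described above reduces to reading the EEG spectrum at the known oddball frequency and its harmonics, relative to the surrounding noise floor. A small Python sketch on simulated data, with the sampling rate and neighbour-bin choices as assumptions:

import numpy as np

fs, dur = 512.0, 60.0                        # sampling rate (assumed) and 60 s sequence
t = np.arange(0, dur, 1 / fs)
f_base, f_odd = 5.88, 5.88 / 5               # base image rate and face (oddball) rate
eeg = (np.sin(2 * np.pi * f_base * t) + 0.5 * np.sin(2 * np.pi * f_odd * t)
       + np.random.randn(t.size))            # toy signal: both rhythms plus noise

spec = np.abs(np.fft.rfft(eeg)) / t.size
freqs = np.fft.rfftfreq(t.size, 1 / fs)

def snr_at(f_target, n_neighbours=10, skip=1):
    i = np.argmin(np.abs(freqs - f_target))                    # bin closest to target
    neigh = np.r_[i - skip - n_neighbours:i - skip, i + skip + 1:i + skip + 1 + n_neighbours]
    return spec[i] / spec[neigh].mean()                        # amplitude vs. local noise

for harmonic in (1, 2, 3):
    print(f"SNR at {harmonic * f_odd:.2f} Hz:", round(float(snr_at(harmonic * f_odd)), 2))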

Journal ArticleDOI
TL;DR: The present evidence supports a model in which the natural statistics of temporal information in the visual world may affect domain-specific temporal processing and encoding capacities, and the functional organization of high-level visual cortex may be constrained by temporal characteristics of stimuli in the natural world.
Abstract: Prevailing hierarchical models propose that temporal processing capacity--the amount of information that a brain region processes in a unit time--decreases at higher stages in the ventral stream regardless of domain. However, it is unknown if temporal processing capacities are domain general or domain specific in human high-level visual cortex. Using a novel fMRI paradigm, we measured temporal capacities of functional regions in high-level visual cortex. Contrary to hierarchical models, our data reveal domain-specific processing capacities as follows: (1) regions processing information from different domains have differential temporal capacities within each stage of the visual hierarchy and (2) domain-specific regions display the same temporal capacity regardless of their position in the processing hierarchy. In general, character-selective regions have the lowest capacity, face- and place-selective regions have an intermediate capacity, and body-selective regions have the highest capacity. Notably, domain-specific temporal processing capacities are not apparent in V1 and have perceptual implications. Behavioral testing revealed that the encoding capacity of body images is higher than that of characters, faces, and places, and there is a correspondence between peak encoding rates and cortical capacities for characters and bodies. The present evidence supports a model in which the natural statistics of temporal information in the visual world may affect domain-specific temporal processing and encoding capacities. These findings suggest that the functional organization of high-level visual cortex may be constrained by temporal characteristics of stimuli in the natural world, and this temporal capacity is a characteristic of domain-specific networks in high-level visual cortex. Significance statement: Visual stimuli bombard us at different rates every day. For example, words and scenes are typically stationary and vary at slow rates. In contrast, bodies are dynamic and typically change at faster rates. Using a novel fMRI paradigm, we measured temporal processing capacities of functional regions in human high-level visual cortex. Contrary to prevailing theories, we find that different regions have different processing capacities, which have behavioral implications. In general, character-selective regions have the lowest capacity, face- and place-selective regions have an intermediate capacity, and body-selective regions have the highest capacity. These results suggest that temporal processing capacity is a characteristic of domain-specific networks in high-level visual cortex and contributes to the segregation of cortical regions.

Journal ArticleDOI
TL;DR: The results, obtained in macaque primary visual cortex with natural images, reveal a gating of a basic, widespread cortical computation by inference about the statistics of natural input.
Abstract: Identical sensory inputs can be perceived as markedly different when embedded in distinct contexts. Neural responses to simple stimuli are also modulated by context, but the contribution of this modulation to the processing of natural sensory input is unclear. We measured surround suppression, a quintessential contextual influence, in macaque primary visual cortex with natural images. We found that suppression strength varied substantially for different images. This variability was not well explained by existing descriptions of surround suppression, but it was predicted by Bayesian inference about statistical dependencies in images. In this framework, surround suppression was flexible: it was recruited when the image was inferred to contain redundancies and substantially reduced in strength otherwise. Thus, our results reveal a gating of a basic, widespread cortical computation by inference about the statistics of natural input.

Journal ArticleDOI
TL;DR: Using fMRI in combination with a generative model-based analysis, it is found that probability distributions reflecting sensory uncertainty could reliably be estimated from human visual cortex and, moreover, that observers appeared to use knowledge of this uncertainty in their perceptual decisions.
Abstract: Bayesian theories of neural coding propose that sensory uncertainty is represented by a probability distribution encoded in neural population activity, but direct neural evidence supporting this hypothesis is currently lacking. Using fMRI in combination with a generative model-based analysis, we found that probability distributions reflecting sensory uncertainty could reliably be estimated from human visual cortex and, moreover, that observers appeared to use knowledge of this uncertainty in their perceptual decisions.

Journal ArticleDOI
TL;DR: It is demonstrated that the brain's representation of auditory speech is enhanced when the accompanying visual speech signal shares the same timing, and this enhancement is most pronounced at a time scale that corresponds to mean syllable length.
Abstract: Congruent audiovisual speech enhances our ability to comprehend a speaker, even in noise-free conditions. When incongruent auditory and visual information is presented concurrently, it can hinder a listener's perception and even cause him or her to perceive information that was not presented in either modality. Efforts to investigate the neural basis of these effects have often focused on the special case of discrete audiovisual syllables that are spatially and temporally congruent, with less work done on the case of natural, continuous speech. Recent electrophysiological studies have demonstrated that cortical response measures to continuous auditory speech can be easily obtained using multivariate analysis methods. Here, we apply such methods to the case of audiovisual speech and, importantly, present a novel framework for indexing multisensory integration in the context of continuous speech. Specifically, we examine how the temporal and contextual congruency of ongoing audiovisual speech affects the cortical encoding of the speech envelope in humans using electroencephalography. We demonstrate that the cortical representation of the speech envelope is enhanced by the presentation of congruent audiovisual speech in noise-free conditions. Furthermore, we show that this is likely attributable to the contribution of neural generators that are not particularly active during unimodal stimulation and that it is most prominent at the temporal scale corresponding to syllabic rate (2–6 Hz). Finally, our data suggest that neural entrainment to the speech envelope is inhibited when the auditory and visual streams are incongruent both temporally and contextually. SIGNIFICANCE STATEMENT Seeing a speaker's face as he or she talks can greatly help in understanding what the speaker is saying. This is because the speaker's facial movements relay information about what the speaker is saying, but also, importantly, when the speaker is saying it. Studying how the brain uses this timing relationship to combine information from continuous auditory and visual speech has traditionally been methodologically difficult. Here we introduce a new approach for doing this using relatively inexpensive and noninvasive scalp recordings. Specifically, we show that the brain's representation of auditory speech is enhanced when the accompanying visual speech signal shares the same timing. Furthermore, we show that this enhancement is most pronounced at a time scale that corresponds to mean syllable length.
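
One common way to implement the kind of multivariate envelope-encoding analysis mentioned above is a regularised temporal response function: time-lagged copies of the speech envelope are regressed onto the EEG, and the correlation between predicted and actual EEG indexes how strongly the envelope is tracked. The sketch below uses simulated data, and the lag range and regularisation are illustrative choices rather than the study's parameters.

import numpy as np
from sklearn.linear_model import Ridge

fs, n = 64, 64 * 120                          # 64 Hz features, two minutes of data
rng = np.random.default_rng(1)
envelope = rng.normal(0, 1, n)                # stand-in for the acoustic speech envelope
eeg = np.convolve(envelope, [0, 0.2, 0.5, 0.3, 0.1], mode="same") + rng.normal(0, 1, n)

lags = np.arange(0, 16)                       # 0-250 ms of lags at 64 Hz
X = np.stack([np.roll(envelope, lag) for lag in lags], axis=1)
X[:16] = 0                                    # drop samples contaminated by wrap-around

half = n // 2                                 # train on the first half, test on the second
trf = Ridge(alpha=10.0).fit(X[:half], eeg[:half])
pred = trf.predict(X[half:])
print("envelope-tracking r:", np.corrcoef(pred, eeg[half:])[0, 1])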

Book
15 May 2015
TL;DR: A feature hierarchy model in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world is described.
Abstract: Neurophysiological evidence for invariant representations of objects and faces in the primate inferior temporal visual cortex is described. Then a computational approach to how invariant representations are formed in the brain is described that builds on the neurophysiology. A feature hierarchy model in which invariant representations can be built by self-organizing learning based on the temporal and spatial statistics of the visual input produced by objects as they transform in the world is described. VisNet can use temporal continuity in an associative synaptic learning rule with a short-term memory trace, and/or it can use spatial continuity in continuous spatial transformation learning which does not require a temporal trace. The model of visual processing in the ventral cortical stream can build representations of objects that are invariant with respect to translation, view, size, and also lighting. The model has been extended to provide an account of invariant representations in the dorsal visual system of the global motion produced by objects such as looming, rotation, and object-based movement. The model has been extended to incorporate top-down feedback connections to model the control of attention by biased competition in, for example, spatial and object search tasks. The approach has also been extended to account for how the visual system can select single objects in complex visual scenes, and how multiple objects can be represented in a scene. The approach has also been extended to provide, with an additional layer, for the development of representations of spatial scenes of the type found in the hippocampus.
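
The short-term-memory trace rule mentioned above can be written in a few lines: the weight update uses a decaying trace of each output neuron's recent activity, so successive transformed views of the same object (which occur close together in time) become associated with the same output neurons. Parameter values and the crude winner-take-all competition below are illustrative assumptions, not VisNet's full architecture.

import numpy as np

alpha, eta = 0.05, 0.8                                   # learning rate and trace decay (assumed)
n_inputs, n_outputs = 100, 10
w = np.random.rand(n_outputs, n_inputs) * 0.01
y_trace = np.zeros(n_outputs)

def present(x):
    # one stimulus presentation: activation, trace update, trace-rule weight update
    global y_trace, w
    y = w @ x                                            # feedforward activation
    y = (y == y.max()).astype(float)                     # crude winner-take-all competition
    y_trace = (1 - eta) * y + eta * y_trace              # short-term memory trace
    w += alpha * np.outer(y_trace, x)                    # trace rule: dw = alpha * trace * x
    w /= np.linalg.norm(w, axis=1, keepdims=True)        # keep weight vectors bounded

view1, view2 = np.random.rand(n_inputs), np.random.rand(n_inputs)
present(view1); present(view2)    # consecutive views get bound together via the lingering trace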

Journal ArticleDOI
TL;DR: It is tested whether different photometric variables influence visual perception and the comfort of the lighting, as well as subjective non-visual variables such as mood, alertness and well-being.
Abstract: Lighting conditions in workplaces contribute to a variety of factors related to work satisfaction, productivity and well-being. We tested whether different photometric variables also influence visual perception and the comfort of the lighting, as well as subjective non-visual variables such as mood, alertness and well-being. Twenty-five young subjects spent two afternoons either under electric light or daylighting conditions (without view from the window). Subjects overall preferred the daylighting for visual acceptance and glare. Changes of photometric variables modulated changes in visual light perception, alertness and mood in the course of the afternoon. Finally, we found several associations of visual and non-visual functions, indicating a potential relationship of visual comfort with other circadian and wake-dependent functions in humans, which consequently could impact office lighting scenarios in the future.

Journal ArticleDOI
TL;DR: It is found that the human brain is uniquely sensitive to numerosity and more sensitive to changes in numerosity than to changes in other visual properties, starting extremely early in the visual stream: 75 ms over a medial occipital site and 180 ms over bilateral occipitoparietal sites.
Abstract: Humans are endowed with an intuitive number sense that allows us to perceive and estimate numerosity without relying on language. It is controversial, however, as to whether there is a neural mechanism for direct perception of numerosity or whether numerosity is perceived indirectly via other perceptual properties. In this study, we used a novel regression-based analytic method, which allowed an assessment of the unique contributions of visual properties, including numerosity, to explain visual evoked potentials of participants passively viewing dot arrays. We found that the human brain is uniquely sensitive to numerosity and more sensitive to changes in numerosity than to changes in other visual properties, starting extremely early in the visual stream: 75 ms over a medial occipital site and 180 ms over bilateral occipitoparietal sites. These findings provide strong evidence for the existence of a neural mechanism for rapidly and directly extracting numerosity information in the human visual pathway.
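
The regression-based logic described above amounts to modelling the evoked response as a linear combination of several stimulus properties at once, so the unique contribution of numerosity appears as its regression coefficient while other properties are held constant. A small Python sketch on simulated trials, with made-up properties and effect sizes:

import numpy as np

rng = np.random.default_rng(2)
n_trials = 300
numerosity = rng.integers(5, 30, n_trials).astype(float)
dot_size = rng.uniform(0.5, 2.0, n_trials)
total_area = numerosity * dot_size
vep_amplitude = 0.4 * np.log(numerosity) + 0.1 * total_area + rng.normal(0, 1, n_trials)

# design matrix with an intercept and several (correlated) visual properties
X = np.column_stack([np.ones(n_trials), np.log(numerosity), dot_size, total_area])
coef, *_ = np.linalg.lstsq(X, vep_amplitude, rcond=None)
print(dict(zip(["intercept", "log_numerosity", "dot_size", "total_area"], np.round(coef, 3))))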