
Showing papers on "Visual perception published in 2018"


Proceedings ArticleDOI
21 May 2018
TL;DR: In this paper, the authors proposed a method to improve the quality of visual underwater scenes using Generative Adversarial Networks (GANs), with the goal of improving input to vision-driven behaviors further down the autonomy pipeline.
Abstract: Autonomous underwater vehicles (AUVs) rely on a variety of sensors - acoustic, inertial and visual - for intelligent decision making. Due to its non-intrusive, passive nature and high information content, vision is an attractive sensing modality, particularly at shallower depths. However, factors such as light refraction and absorption, suspended particles in the water, and color distortion affect the quality of visual data, resulting in noisy and distorted images. AUVs that rely on visual sensing thus face difficult challenges and consequently exhibit poor performance on vision-driven tasks. This paper proposes a method to improve the quality of visual underwater scenes using Generative Adversarial Networks (GANs), with the goal of improving input to vision-driven behaviors further down the autonomy pipeline. Furthermore, we show how recently proposed methods are able to generate a dataset for the purpose of such underwater image restoration. For any visually-guided underwater robots, this improvement can result in increased safety and reliability through robust visual perception. To that effect, we present quantitative and qualitative data which demonstrate that images corrected through the proposed approach are more visually appealing and also provide increased accuracy for a diver-tracking algorithm.
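
As a rough illustration of the paired image-to-image GAN training described above, the sketch below shows one generator/discriminator update step in PyTorch. The tiny architecture, loss weights, and tensor shapes are placeholder assumptions, not the authors' model.

```python
# A minimal sketch (not the authors' architecture) of a paired image-to-image GAN
# training step for underwater image restoration, assuming aligned (distorted,
# restored) image pairs such as those produced by the dataset-generation method
# the paper describes. Names and hyperparameters here are illustrative only.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Toy encoder-decoder that maps a distorted image to a corrected one."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1), nn.Tanh(),
        )
    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Toy PatchGAN-style critic that scores (input, output) pairs."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 1, 4, stride=2, padding=1),
        )
    def forward(self, distorted, candidate):
        return self.net(torch.cat([distorted, candidate], dim=1))

G, D = Generator(), Discriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
adv_loss, l1_loss = nn.BCEWithLogitsLoss(), nn.L1Loss()

distorted = torch.randn(4, 3, 64, 64)   # stand-in for underwater images
clean = torch.randn(4, 3, 64, 64)       # stand-in for restored targets

# Discriminator step: real pairs vs. generated pairs.
fake = G(distorted).detach()
d_real, d_fake = D(distorted, clean), D(distorted, fake)
loss_d = adv_loss(d_real, torch.ones_like(d_real)) + \
         adv_loss(d_fake, torch.zeros_like(d_fake))
opt_d.zero_grad(); loss_d.backward(); opt_d.step()

# Generator step: fool the discriminator while staying close to the target.
fake = G(distorted)
d_fake = D(distorted, fake)
loss_g = adv_loss(d_fake, torch.ones_like(d_fake)) + 100.0 * l1_loss(fake, clean)
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```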

387 citations


Journal ArticleDOI
04 May 2018-Science
TL;DR: This work investigated the fate of weak visual stimuli in the visual and frontal cortex of awake monkeys trained to report stimulus presence and proposed a model in which stimuli become consciously reportable when they elicit a nonlinear ignition process in higher cortical areas.
Abstract: Why are some visual stimuli consciously detected, whereas others remain subliminal? We investigated the fate of weak visual stimuli in the visual and frontal cortex of awake monkeys trained to report stimulus presence. Reported stimuli were associated with strong sustained activity in the frontal cortex, and frontal activity was weaker and quickly decayed for unreported stimuli. Information about weak stimuli could be lost at successive stages en route from the visual to the frontal cortex, and these propagation failures were confirmed through microstimulation of area V1. Fluctuations in response bias and sensitivity during perception of identical stimuli were traced back to prestimulus brain-state markers. A model in which stimuli become consciously reportable when they elicit a nonlinear ignition process in higher cortical areas explained our results.

250 citations


Journal ArticleDOI
TL;DR: It is found that mice can detect figures defined by an orientation that differs from the background while the figure size, position or phase varied, enabling investigation with the powerful techniques for circuit analysis now available in mice.
Abstract: Figure-ground segregation is the process by which the visual system identifies image elements of figures and segregates them from the background. Previous studies examined figure-ground segregation in the visual cortex of monkeys where figures elicit stronger neuronal responses than backgrounds. It was demonstrated in anesthetized mice that neurons in the primary visual cortex (V1) of mice are sensitive to orientation contrast, but it is unknown whether mice can perceptually segregate figures from a background. Here, we examined figure-ground perception of mice and found that mice can detect figures defined by an orientation that differs from the background while the figure size, position or phase varied. Electrophysiological recordings in V1 of awake mice revealed that the responses elicited by figures were stronger than those elicited by the background and even stronger at the edge between figure and background. A figural response could even be evoked in the absence of a stimulus in the V1 receptive field. Current-source-density analysis suggested that the extra activity was caused by synaptic inputs into layer 2/3. We conclude that the neuronal mechanisms of figure-ground segregation in mice are similar to those in primates, enabling investigation with the powerful techniques for circuit analysis now available in mice.

239 citations


Reference EntryDOI
Frank Tong1
23 Mar 2018

192 citations


Journal ArticleDOI
TL;DR: In this article, the authors combined psychophysics, physiology, and computational models to test the hypothesis that pattern completion is implemented by recurrent computations and present three pieces of evidence that are consistent with this hypothesis.
Abstract: Making inferences from partial information constitutes a critical aspect of cognition. During visual perception, pattern completion enables recognition of poorly visible or occluded objects. We combined psychophysics, physiology, and computational models to test the hypothesis that pattern completion is implemented by recurrent computations and present three pieces of evidence that are consistent with this hypothesis. First, subjects robustly recognized objects even when they were rendered <15% visible, but recognition was largely impaired when processing was interrupted by backward masking. Second, invasive physiological responses along the human ventral cortex exhibited visually selective responses to partially visible objects that were delayed compared with whole objects, suggesting the need for additional computations. These physiological delays were correlated with the effects of backward masking. Third, state-of-the-art feed-forward computational architectures were not robust to partial visibility. However, recognition performance was recovered when the model was augmented with attractor-based recurrent connectivity. The recurrent model was able to predict which images of heavily occluded objects were easier or harder for humans to recognize, could capture the effect of introducing a backward mask on recognition behavior, and was consistent with the physiological delays along the human ventral visual stream. These results provide a strong argument of plausibility for the role of recurrent computations in making visual inferences from partial information.
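
To make the attractor-based recurrent idea concrete, here is a minimal sketch of Hopfield-style recurrent completion of a heavily occluded pattern; the network size, pattern count, and 15% visibility cue are illustrative assumptions rather than the authors' published model.

```python
# A minimal sketch of attractor-based recurrent completion: store feature
# patterns in a Hopfield-style network and let recurrent updates restore a
# heavily occluded (partially zeroed) input. This toy network is illustrative;
# it is not the model described in the paper.
import numpy as np

rng = np.random.default_rng(0)
n_units, n_patterns = 200, 5
patterns = np.sign(rng.standard_normal((n_patterns, n_units)))  # +/-1 codes

# Hebbian weights with no self-connections.
W = (patterns.T @ patterns) / n_units
np.fill_diagonal(W, 0.0)

# "Occlude" a stored pattern: keep only 15% of its elements, zero the rest.
target = patterns[0].copy()
visible = rng.random(n_units) < 0.15
state = np.where(visible, target, 0.0)

# Recurrent dynamics: repeatedly update toward the nearest stored attractor.
for _ in range(20):
    state = np.sign(W @ state)
    state[state == 0] = 1.0  # break ties deterministically

overlap = np.mean(state == target)
print(f"fraction of units recovered: {overlap:.2f}")
```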

172 citations


Journal ArticleDOI
TL;DR: A new BCI speller based on miniature asymmetric visual evoked potentials (aVEPs), which encodes 32 characters with a space-code division multiple access scheme and decodes EEG features with a discriminative canonical pattern matching algorithm is developed.
Abstract: Goal: Traditional visual brain–computer interfaces (BCIs) preferred to use large-size stimuli to attract the user's attention and elicit distinct electroencephalography (EEG) features. However, the visual stimuli are of no interest to the users as they just serve as the hidden codes behind the characters. Furthermore, using stronger visual stimuli could cause visual fatigue and other adverse symptoms to users. Therefore, it is imperative for visual BCIs to use small and inconspicuous visual stimuli to code characters. Methods: This study developed a new BCI speller based on miniature asymmetric visual evoked potentials (aVEPs), which encodes 32 characters with a space-code division multiple access scheme and decodes EEG features with a discriminative canonical pattern matching algorithm. Notably, the visual stimulus used in this study only subtended 0.5° of visual angle and was placed outside foveal vision on the lateral side, which could only induce a miniature potential about 0.5 μV in amplitude and about 16.5 dB in signal-to-noise ratio. A total of 12 subjects were recruited to use the miniature aVEP speller in both offline and online tests. Results: Information transfer rates up to 63.33 b/min could be achieved from online tests (online demo URL: https://www.youtube.com/edit?o=U&video_id=kC7btB3mvGY ). Conclusion: Experimental results demonstrate the feasibility of using very small and inconspicuous visual stimuli to implement an efficient BCI system, even though the elicited EEG features are very weak. Significance: The proposed innovative technique can broaden the category of BCIs and strengthen brain–computer communication.
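
Information transfer rates of this kind are conventionally computed with the standard Wolpaw formula from the number of classes, the classification accuracy, and the time per selection, as in the sketch below; the accuracy and selection-time values are placeholders, not figures from the paper.

```python
# Wolpaw information transfer rate (ITR) for an N-class speller.
import math

def itr_bits_per_min(n_classes: int, accuracy: float, seconds_per_selection: float) -> float:
    """ITR in bits per minute for given class count, accuracy, and selection time."""
    n, p, t = n_classes, accuracy, seconds_per_selection
    if p >= 1.0:
        bits = math.log2(n)
    else:
        bits = (math.log2(n) + p * math.log2(p)
                + (1 - p) * math.log2((1 - p) / (n - 1)))
    return bits * (60.0 / t)

# Example (illustrative numbers): a 32-class speller at 90% accuracy, 3 s per selection.
print(f"{itr_bits_per_min(32, 0.90, 3.0):.1f} bits/min")
```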

154 citations


Journal ArticleDOI
TL;DR: While alpha oscillations are strongly associated with reductions in visual attention, they also appear to play important roles in regulating the timing and temporal resolution of perception; they are further associated with top-down control and may facilitate transmission of predictions to visual cortex.
Abstract: A central feature of human brain activity is the alpha rhythm: a 7-13 Hz oscillation observed most notably over occipitoparietal brain regions during periods of eyes-closed rest. Alpha oscillations covary with changes in visual processing and have been associated with a broad range of neurocognitive functions. In this article, we review these associations and suggest that alpha oscillations can be thought to exhibit at least five distinct 'characters': those of the inhibitor, perceiver, predictor, communicator and stabiliser. In short, while alpha oscillations are strongly associated with reductions in visual attention, they also appear to play important roles in regulating the timing and temporal resolution of perception. Furthermore, alpha oscillations are strongly associated with top-down control and may facilitate transmission of predictions to visual cortex. This is in addition to promoting communication between frontal and posterior brain regions more generally, as well as maintaining ongoing perceptual states. We discuss why alpha oscillations might associate with such a broad range of cognitive functions and suggest ways in which these diverse associations can be studied experimentally.

143 citations


Journal ArticleDOI
TL;DR: It is found that the peak frequency of alpha oscillations decreased when visual task demands required temporal integration compared with segregation, and alpha frequency was strategically modulated immediately before and during stimulus processing, suggesting a preparatory top-down source of modulation.
Abstract: Temporal integration in visual perception is thought to occur within cycles of occipital alpha-band (8–12 Hz) oscillations. Successive stimuli may be integrated when they fall within the same alpha cycle and segregated for different alpha cycles. Consequently, the speed of alpha oscillations correlates with the temporal resolution of perception, such that lower alpha frequencies provide longer time windows for perceptual integration and higher alpha frequencies correspond to faster sampling and segregation. Can the brain’s rhythmic activity be dynamically controlled to adjust its processing speed according to different visual task demands? We recorded magnetoencephalography (MEG) while participants switched between task instructions for temporal integration and segregation, holding stimuli and task difficulty constant. We found that the peak frequency of alpha oscillations decreased when visual task demands required temporal integration compared with segregation. Alpha frequency was strategically modulated immediately before and during stimulus processing, suggesting a preparatory top-down source of modulation. Its neural generators were located in occipital and inferotemporal cortex. The frequency modulation was specific to alpha oscillations and did not occur in the delta (1–3 Hz), theta (3–7 Hz), beta (15–30 Hz), or gamma (30–50 Hz) frequency range. These results show that alpha frequency is under top-down control to increase or decrease the temporal resolution of visual perception.
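
As a simple illustration of how a peak alpha frequency can be estimated from a single sensor's signal, the sketch below uses Welch's method on synthetic data; the sampling rate, window length, and signal are assumptions and do not reproduce the MEG analysis in the paper.

```python
# Estimate the peak frequency in the alpha band (8-12 Hz) from a time series.
import numpy as np
from scipy.signal import welch

fs = 1000.0                      # sampling rate (Hz), illustrative
t = np.arange(0, 10, 1 / fs)     # 10 s of data
# Synthetic signal: a 10.5 Hz alpha rhythm buried in noise.
signal = np.sin(2 * np.pi * 10.5 * t) + np.random.randn(t.size)

freqs, psd = welch(signal, fs=fs, nperseg=int(2 * fs))  # 2 s windows -> 0.5 Hz resolution
alpha_band = (freqs >= 8) & (freqs <= 12)
peak_alpha = freqs[alpha_band][np.argmax(psd[alpha_band])]
print(f"estimated peak alpha frequency: {peak_alpha:.1f} Hz")
```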

131 citations


Journal ArticleDOI
15 Sep 2018
TL;DR: In this article, the authors propose that a key mechanism is the reorganization of spatiotemporal visual fields, which transiently increases the temporal and spatial uncertainty of visual representations just before and during saccades.
Abstract: The perceptual consequences of eye movements are manifold: Each large saccade is accompanied by a drop of sensitivity to luminance-contrast, low-frequency stimuli, impacting both conscious vision and involuntary responses, including pupillary constrictions. They also produce transient distortions of space, time, and number, which cannot be attributed to the mere motion on the retinae. All these are signs that the visual system evokes active processes to predict and counteract the consequences of saccades. We propose that a key mechanism is the reorganization of spatiotemporal visual fields, which transiently increases the temporal and spatial uncertainty of visual representations just before and during saccades. On one hand, this accounts for the spatiotemporal distortions of visual perception; on the other hand, it implements a mechanism for fusing pre- and postsaccadic stimuli. This, together with the active suppression of motion signals, ensures the stability and continuity of our visual experience.

131 citations


Journal ArticleDOI
TL;DR: Both the group-average and individual-subject results reveal robust signals across much of the brain, including occipital, temporal, parietal, and frontal cortex as well as subcortical areas, and split-half analyses show strong within-subject reliability, further demonstrating the high quality of the data.
Abstract: About a quarter of human cerebral cortex is dedicated mainly to visual processing. The large-scale spatial organization of visual cortex can be measured with functional magnetic resonance imaging (fMRI) while subjects view spatially modulated visual stimuli, also known as "retinotopic mapping." One of the datasets collected by the Human Connectome Project involved ultrahigh-field (7 Tesla) fMRI retinotopic mapping in 181 healthy young adults (1.6-mm resolution), yielding the largest freely available collection of retinotopy data. Here, we describe the experimental paradigm and the results of model-based analysis of the fMRI data. These results provide estimates of population receptive field position and size. Our analyses include both results from individual subjects as well as results obtained by averaging fMRI time series across subjects at each cortical and subcortical location and then fitting models. Both the group-average and individual-subject results reveal robust signals across much of the brain, including occipital, temporal, parietal, and frontal cortex as well as subcortical areas. The group-average results agree well with previously published parcellations of visual areas. In addition, split-half analyses show strong within-subject reliability, further demonstrating the high quality of the data. We make publicly available the analysis results for individual subjects and the group average, as well as associated stimuli and analysis code. These resources provide an opportunity for studying fine-scale individual variability in cortical and subcortical organization and the properties of high-resolution fMRI. In addition, they provide a set of observations that can be compared with other Human Connectome Project measures acquired in these same participants.
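
The population receptive field (pRF) models mentioned above predict a location's response from the overlap between the stimulus aperture and a 2D Gaussian. A minimal sketch of that forward model follows, with grid sizes and the bar aperture chosen purely for illustration; this is not the HCP analysis code.

```python
# Forward model for a 2D isotropic Gaussian pRF: the predicted response to a
# binary stimulus aperture is the aperture summed under the Gaussian.
import numpy as np

def prf_prediction(apertures, x0, y0, sigma, extent=10.0):
    """Predicted response time course for a pRF at (x0, y0) with size sigma.

    apertures: array (n_timepoints, n_pix, n_pix) of 0/1 stimulus masks,
               covering [-extent, extent] degrees in both dimensions.
    """
    n_pix = apertures.shape[-1]
    xs = np.linspace(-extent, extent, n_pix)
    X, Y = np.meshgrid(xs, xs)
    gauss = np.exp(-((X - x0) ** 2 + (Y - y0) ** 2) / (2 * sigma ** 2))
    return apertures.reshape(len(apertures), -1) @ gauss.ravel()

# Example: a drifting bar aperture and a pRF centred at (2, -1) deg, sigma 1.5 deg.
apertures = np.zeros((20, 101, 101))
for i in range(20):
    apertures[i, :, 5 * i:5 * i + 10] = 1.0   # vertical bar sweeping left to right
pred = prf_prediction(apertures, x0=2.0, y0=-1.0, sigma=1.5)
print(pred.round(1))
```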

131 citations


Proceedings ArticleDOI
01 Jan 2018
TL;DR: An emotion prioritization effect is discovered: for the authors' images, emotion-eliciting content attracts human attention strongly, but such advantage diminishes dramatically after initial fixation, and the proposed network outperforms the state-of-the-art on three benchmark datasets, by effectively capturing the relative importance of human attention within an image.
Abstract: Image sentiment influences visual perception. Emotion-eliciting stimuli such as happy faces and poisonous snakes are generally prioritized in human attention. However, little research has evaluated the interrelationships of image sentiment and visual saliency. In this paper, we present the first study to focus on the relation between emotional properties of an image and visual attention. We first create the EMOtional attention dataset (EMOd). It is a diverse set of emotion-eliciting images, and each image has (1) eye-tracking data collected from 16 subjects, (2) intensive image context labels including object contour, object sentiment, object semantic category, and high-level perceptual attributes such as image aesthetics and elicited emotions. We perform extensive analyses on EMOd to identify how image sentiment relates to human attention. We discover an emotion prioritization effect: for our images, emotion-eliciting content attracts human attention strongly, but such advantage diminishes dramatically after initial fixation. Aiming to model the human emotion prioritization computationally, we design a deep neural network for saliency prediction, which includes a novel subnetwork that learns the spatial and semantic context of the image scene. The proposed network outperforms the state-of-the-art on three benchmark datasets, by effectively capturing the relative importance of human attention within an image. The code, models, and dataset are available online at https://nus-sesame.top/emotionalattention/.
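
For readers unfamiliar with how saliency predictions are scored against eye-tracking data such as EMOd's, the sketch below computes one common metric, the Normalized Scanpath Saliency (NSS); the random maps are placeholders and this is not the paper's evaluation code.

```python
# Normalized Scanpath Saliency: mean of the z-scored saliency map at fixated pixels.
import numpy as np

def nss(saliency_map, fixation_map):
    """saliency_map: float array; fixation_map: binary array marking fixated pixels."""
    s = (saliency_map - saliency_map.mean()) / (saliency_map.std() + 1e-8)
    return s[fixation_map.astype(bool)].mean()

rng = np.random.default_rng(0)
saliency = rng.random((240, 320))                       # placeholder prediction
fixations = np.zeros((240, 320), dtype=int)
fixations[rng.integers(0, 240, 16), rng.integers(0, 320, 16)] = 1  # e.g. 16 observers
print(f"NSS = {nss(saliency, fixations):.3f}")
```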

Journal ArticleDOI
07 Mar 2018-eLife
TL;DR: The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information.
Abstract: Inherent correlations between visual and semantic features in real-world scenes make it difficult to determine how different scene properties contribute to neural representations. Here, we assessed the contributions of multiple properties to scene representation by partitioning the variance explained in human behavioral and brain measurements by three feature models whose inter-correlations were minimized a priori through stimulus preselection. Behavioral assessments of scene similarity reflected unique contributions from a functional feature model indicating potential actions in scenes as well as high-level visual features from a deep neural network (DNN). In contrast, similarity of cortical responses in scene-selective areas was uniquely explained by mid- and high-level DNN features only, while an object label model did not contribute uniquely to either domain. The striking dissociation between functional and DNN features in their contribution to behavioral and brain representations of scenes indicates that scene-selective cortex represents only a subset of behaviorally relevant scene information.
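
The variance-partitioning logic can be sketched with simple linear regression on synthetic data: the unique contribution of a feature model is the drop in R² when its predictors are removed from the full model. The sketch below is a toy stand-in, not the authors' analysis pipeline.

```python
# Unique variance explained by each of three feature models via R^2 differences.
import numpy as np

def r_squared(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(1)
n = 500
functional = rng.standard_normal((n, 3))   # e.g. action-related features
dnn = rng.standard_normal((n, 3))          # e.g. deep network features
objects = rng.standard_normal((n, 3))      # e.g. object-label features
y = functional @ [1, 0.5, 0] + dnn @ [0.8, 0, 0.3] + 0.5 * rng.standard_normal(n)

full = np.column_stack([functional, dnn, objects])
r2_full = r_squared(full, y)
for name, block in [("functional", functional), ("DNN", dnn), ("objects", objects)]:
    reduced = np.column_stack([b for b in (functional, dnn, objects) if b is not block])
    print(f"unique R^2 ({name}): {r2_full - r_squared(reduced, y):.3f}")
```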

Posted ContentDOI
29 Jun 2018-bioRxiv
TL;DR: An open, large-scale physiological survey of neural activity in the awake mouse visual cortex: the Allen Brain Observatory Visual Coding dataset is reported, revealing functional differences across these dimensions and showing that visual cortical responses are sparse but correlated.
Abstract: To understand how the brain processes sensory information to guide behavior, we must know how stimulus representations are transformed throughout the visual cortex. Here we report an open, large-scale physiological survey of neural activity in the awake mouse visual cortex: the Allen Brain Observatory Visual Coding dataset. This publicly available dataset includes cortical activity from nearly 60,000 neurons collected from 6 visual areas, 4 layers, and 12 transgenic mouse lines from 221 adult mice, in response to a systematic set of visual stimuli. Using this dataset, we reveal functional differences across these dimensions and show that visual cortical responses are sparse but correlated. Surprisingly, responses to different stimuli are largely independent, e.g. whether a neuron responds to natural scenes provides no information about whether it responds to natural movies or to gratings. We show that these phenomena cannot be explained by standard local filter-based models, but are consistent with multi-layer hierarchical computation, as found in deeper layers of standard convolutional neural networks.
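
One common way to quantify the claim that responses are "sparse" is lifetime sparseness (Vinje & Gallant); a minimal sketch follows, using random placeholder responses rather than data or code from the Allen Brain Observatory.

```python
# Lifetime sparseness per neuron from its mean responses to a stimulus set.
import numpy as np

def lifetime_sparseness(responses):
    """responses: array (n_stimuli,) of non-negative mean responses for one neuron."""
    r = np.asarray(responses, dtype=float)
    n = r.size
    return (1.0 - (r.mean() ** 2) / np.mean(r ** 2)) / (1.0 - 1.0 / n)

rng = np.random.default_rng(0)
dense_neuron = rng.random(118)                  # responds broadly to many scenes
sparse_neuron = np.zeros(118)
sparse_neuron[:4] = rng.random(4) * 5           # responds to only a few scenes
print(f"dense:  {lifetime_sparseness(dense_neuron):.2f}")
print(f"sparse: {lifetime_sparseness(sparse_neuron):.2f}")
```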

Journal ArticleDOI
01 Jul 2018-Autism
TL;DR: The cascading impact of sensory abilities in autism is described, whereby temporal processing impacts multisensory integration of social information, which, in turn, contributes to deficits in speech perception.
Abstract: It has been recently theorized that atypical sensory processing in autism relates to difficulties in social communication. Through a series of tasks concurrently assessing multisensory temporal processes, multisensory integration and speech perception in 76 children with and without autism, we provide the first behavioral evidence of such a link. Temporal processing abilities in children with autism contributed to impairments in speech perception. This relationship was significantly mediated by their abilities to integrate social information across auditory and visual modalities. These data describe the cascading impact of sensory abilities in autism, whereby temporal processing impacts multisensory integration of social information, which, in turn, contributes to deficits in speech perception. These relationships were found to be specific to autism, specific to multisensory but not unisensory integration, and specific to the processing of social information.

Journal ArticleDOI
TL;DR: This paper studies the problem of predicting head movement, head–eye motion, and scanpath of viewers when they are watching 360 degree images in the commodity HMDs and designs a model to predict the saliency maps for the first two, and the scanpaths for the last one.
Abstract: Estimating salient areas of visual stimuli which are liable to attract viewers' visual attention is a challenging task because of the high complexity of cognitive behaviors in the brain. Many researchers have been dedicated to this field and obtained many achievements. Some application areas, ranging from computer vision and computer graphics to multimedia processing, can benefit from saliency detection, considering that the detected saliency depicts the visual importance of different areas of the visual stimuli. As for 360 degree visual stimuli, images and videos should record the whole scene in the 3D world, so the resolutions of panoramic images and videos are usually very high. However, when watching 360 degree stimuli, observers can only see the part of the scene in the viewport, which is presented to their eyes through a Head Mounted Display (HMD). Sending the whole video or rendering the whole scene may therefore waste resources. Thus, if we can predict the current field of view, streaming and rendering can focus on the scene within that field of view. Furthermore, if we can predict salient areas in the scene, more refined processing can be applied to the visually important areas. The prediction of salient regions for traditional images and videos has been extensively studied. However, conventional saliency prediction methods are not fully adequate for 360 degree content, because 360 degree stimuli have some unique characteristics, and related work in this area is limited. In this paper, we study the problem of predicting head movement, head–eye motion, and scanpaths of viewers when they are watching 360 degree images in commodity HMDs. Three types of data are specifically analyzed. The first is the head movement data, which can be regarded as the movement of the viewport. The second is the head–eye motion data, which combines the motion of the head and the movement of the eyes within the viewport. The third is the scanpath data of observers in the entire panorama, which records position as well as time information. Our model is designed to predict saliency maps for the first two and scanpaths for the last. Experimental results demonstrate the effectiveness of our model.
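
A small sketch of the geometry underlying head-movement data in 360 degree viewing: each recorded head orientation (yaw, pitch) maps to a pixel of the equirectangular panorama, around which a fixation or saliency map can be accumulated. Resolutions and angles below are illustrative assumptions, not details from the paper.

```python
# Map head orientations to equirectangular pixel coordinates and accumulate them.
import numpy as np

def head_orientation_to_pixel(yaw_deg, pitch_deg, width, height):
    """Map yaw in [-180, 180] and pitch in [-90, 90] to equirectangular (col, row)."""
    col = (yaw_deg + 180.0) / 360.0 * (width - 1)
    row = (90.0 - pitch_deg) / 180.0 * (height - 1)
    return int(round(col)), int(round(row))

W, H = 2048, 1024
fixation_map = np.zeros((H, W))
head_samples = [(-30.0, 5.0), (-25.0, 4.0), (10.0, -8.0)]  # (yaw, pitch) in degrees
for yaw, pitch in head_samples:
    col, row = head_orientation_to_pixel(yaw, pitch, W, H)
    fixation_map[row, col] += 1.0   # a real pipeline would blur this with a Gaussian

print(fixation_map.sum(), fixation_map.max())
```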

Journal ArticleDOI
04 Apr 2018-Neuron
TL;DR: It is proposed that visual-motion processing in V1 L6 is multisensory and contextually dependent on the motion status of the animal’s head.

Book ChapterDOI
Woojae Kim1, Jongyoo Kim2, Sewoong Ahn1, Jinwoo Kim1, Sanghoon Lee1 
08 Sep 2018
TL;DR: A novel full-reference VQA framework named Deep Video Quality Assessor (DeepVQA) is proposed to quantify spatio-temporal visual perception via a convolutional neural network (CNN) and a convolutional neural aggregation network (CNAN), and to handle the temporal variation of distortions with an attention-based temporal pooling method.
Abstract: Incorporating spatio-temporal human visual perception into video quality assessment (VQA) remains a formidable issue. Previous statistical or computational models of spatio-temporal perception have limitations when applied to general VQA algorithms. In this paper, we propose a novel full-reference (FR) VQA framework named Deep Video Quality Assessor (DeepVQA) to quantify spatio-temporal visual perception via a convolutional neural network (CNN) and a convolutional neural aggregation network (CNAN). Our framework learns the spatio-temporal sensitivity behavior in accordance with the subjective score. In addition, to handle the temporal variation of distortions, we propose a novel temporal pooling method using an attention model. In the experiment, we show that DeepVQA achieves state-of-the-art prediction accuracy of more than 0.9 correlation, approximately 5% higher than that of conventional methods on the LIVE and CSIQ video databases.
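
To illustrate the idea of attention-based temporal pooling, the sketch below weights per-frame quality scores with a softmax over attention logits instead of averaging them uniformly; the numbers are hand-picked placeholders and this is not the DeepVQA implementation.

```python
# Attention-weighted temporal pooling of per-frame quality scores.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

frame_scores = np.array([0.92, 0.90, 0.55, 0.50, 0.88])  # per-frame quality
attention_logits = np.array([0.1, 0.1, 2.0, 2.2, 0.2])   # emphasis on distorted frames

weights = softmax(attention_logits)
pooled = float(np.dot(weights, frame_scores))
print(f"mean pooling:      {frame_scores.mean():.3f}")
print(f"attention pooling: {pooled:.3f}")
```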

Journal ArticleDOI
17 Sep 2018
TL;DR: It is argued that research should focus on how color processing is adapted to the surface properties of objects in the natural environment in order to bridge the gap between the known early stages of color perception and the subjective appearance of color.
Abstract: Color has been scientifically investigated by linking color appearance to colorimetric measurements of the light that enters the eye. However, the main purpose of color perception is not to determine the properties of incident light, but to aid the visual perception of objects and materials in our environment. We review the state of the art on object colors, color constancy, and color categories to gain insight into the functional aspects of color perception. The common ground between these areas of research is that color appearance is tightly linked to the identification of objects and materials and the communication across observers. In conclusion, we argue that research should focus on how color processing is adapted to the surface properties of objects in the natural environment in order to bridge the gap between the known early stages of color perception and the subjective appearance of color.

Journal ArticleDOI
TL;DR: A group of phenomena such as vection and sensory reweighting are presented that provide information on how visual motion signals are used to maintain balance, taking into account the relationship between visual motion perception and balance control.
Abstract: Falls are the leading cause of accidental injury and death among older adults. One in three adults over the age of 65 years falls annually. As the size of the elderly population increases, falls become a major public health concern, and there is a pressing need to understand the causes of falls thoroughly. While it is well documented that visual functions such as visual acuity, contrast sensitivity, and stereo acuity are correlated with fall risks, little attention has been paid to the relationship between falls and the ability of the visual system to perceive motion in the environment. The omission of visual motion perception in the literature is a critical gap because it is an essential function in maintaining balance. In the present article, we first review existing studies regarding visual risk factors for falls and the effect of ageing vision on falls. We then present a group of phenomena, such as vection and sensory reweighting, that provide information on how visual motion signals are used to maintain balance. We suggest that the current list of visual risk factors for falls should be elaborated by taking into account the relationship between visual motion perception and balance control.

Journal ArticleDOI
01 Jan 2018-Cortex
TL;DR: Focusing on the functions of the dorsal stream in the auditory and language system, this work tries to reconcile the various models of Where, How and When into one coherent concept of sensorimotor integration.

Journal ArticleDOI
TL;DR: This paper shows how a compact convolutional neural network (Compact-CNN), which only requires raw EEG signals for automatic feature extraction, can be used to decode signals from a 12-class SSVEP dataset without the need for user-specific calibration.
Abstract: Steady-State Visual Evoked Potentials (SSVEPs) are neural oscillations from the parietal and occipital regions of the brain that are evoked from flickering visual stimuli. SSVEPs are robust signals measurable in the electroencephalogram (EEG) and are commonly used in brain-computer interfaces (BCIs). However, methods for high-accuracy decoding of SSVEPs usually require hand-crafted approaches that leverage domain-specific knowledge of the stimulus signals, such as specific temporal frequencies in the visual stimuli and their relative spatial arrangement. When this knowledge is unavailable, such as when SSVEP signals are acquired asynchronously, such approaches tend to fail. In this paper, we show how a compact convolutional neural network (Compact-CNN), which only requires raw EEG signals for automatic feature extraction, can be used to decode signals from a 12-class SSVEP dataset without the need for any domain-specific knowledge or calibration data. We report an across-subject mean accuracy of approximately 80% (chance being 8.3%) and show this is substantially better than current state-of-the-art hand-crafted approaches using canonical correlation analysis (CCA) and Combined-CCA. Furthermore, we analyze our Compact-CNN to examine the underlying feature representation, discovering that the deep learner extracts additional phase- and amplitude-related features associated with the structure of the dataset. We discuss how our Compact-CNN shows promise for BCI applications that allow users to freely gaze/attend to any stimulus at any time (e.g., asynchronous BCI) as well as provides a method for analyzing SSVEP signals in a way that might augment our understanding about the basic processing in the visual cortex.
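
As context for the CCA baseline the Compact-CNN is compared against, the sketch below decodes a synthetic SSVEP trial by correlating multichannel EEG with sine/cosine reference banks at each candidate frequency; frequencies, channel count, and noise levels are illustrative assumptions.

```python
# Standard CCA-based SSVEP decoding: pick the stimulation frequency whose
# sine/cosine reference set has the highest canonical correlation with the EEG.
import numpy as np
from sklearn.cross_decomposition import CCA

def cca_corr(eeg, refs):
    """Largest canonical correlation between EEG (samples x channels) and refs."""
    cca = CCA(n_components=1)
    u, v = cca.fit_transform(eeg, refs)
    return np.corrcoef(u[:, 0], v[:, 0])[0, 1]

fs, dur = 250.0, 2.0
t = np.arange(0, dur, 1 / fs)
freqs = [9.25, 11.25, 13.25]                      # candidate SSVEP frequencies
rng = np.random.default_rng(0)
# Synthetic 8-channel EEG containing an 11.25 Hz response plus noise.
eeg = 0.5 * np.sin(2 * np.pi * 11.25 * t)[:, None] + rng.standard_normal((t.size, 8))

scores = []
for f in freqs:
    refs = np.column_stack([np.sin(2 * np.pi * h * f * t) for h in (1, 2)] +
                           [np.cos(2 * np.pi * h * f * t) for h in (1, 2)])
    scores.append(cca_corr(eeg, refs))
print(f"decoded frequency: {freqs[int(np.argmax(scores))]} Hz")
```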

Journal ArticleDOI
21 Mar 2018-Neuron
TL;DR: This Perspective highlights a series of influential studies over the last five decades examining the role of the posterior parietal cortex in visual perception and motor planning and integrates long-standing views of PPC functions with more recent evidence to propose a more general model framework to explain integrative sensory, motor, and cognitive functions of the PPC.

Journal ArticleDOI
TL;DR: Tests of perceptual abilities and of visually evoked electroencephalography and fMRI responses found that detection sensitivity, discrimination accuracy, and subjective visibility change in accordance with noradrenaline (NE) levels, whereas decision bias is not affected.

Journal ArticleDOI
24 Apr 2018
TL;DR: Five stages of development for human V1 that start in infancy and continue across the life span are described, compared with visual and anatomical milestones, and implications for translating treatments for visual disorders that depend on neuroplasticity of V1 function are discussed.
Abstract: The primary visual cortex (V1) is the first cortical area that processes visual information. Normal development of V1 depends on binocular vision during the critical period, and age-related losses of vision are linked with neurobiological changes in V1. Animal studies have provided important details about the neurobiological mechanisms in V1 that support normal vision or are changed by visual diseases. There is very little information, however, about those neurobiological mechanisms in human V1. That lack of information has hampered the translation of biologically inspired treatments from preclinical models to effective clinical treatments. We have studied human V1 to characterize the expression of neurobiological mechanisms that regulate visual perception and neuroplasticity. We have identified five stages of development for human V1 that start in infancy and continue across the life span. Here, we describe these stages, compare them with visual and anatomical milestones, and discuss implications for translating treatments for visual disorders that depend on neuroplasticity of V1 function.

Journal ArticleDOI
TL;DR: Patients with chronic dysfunction following TBI may require occupational, vestibular, cognitive and other forms of physical therapy and benefit from visual rehabilitation, including reading‐related oculomotor training and the prescribing of spectacles with a variety of tints and prism combinations.
Abstract: Traumatic brain injury (TBI) and its associated concussion are major causes of disability and death. All ages can be affected but children, young adults and the elderly are particularly susceptible. A decline in mortality has resulted in many more individuals living with a disability caused by TBI including those affecting vision. This review describes: (1) the major clinical and pathological features of TBI; (2) the visual signs and symptoms associated with the disorder; and (3) discusses the assessment of quality of life and visual rehabilitation of the patient. Defects in primary vision such as visual acuity and visual fields, eye movement including vergence, saccadic and smooth pursuit movements, and in more complex aspects of vision involving visual perception, motion vision (‘akinopsia’), and visuo-spatial function have all been reported in TBI. Eye movement dysfunction may be an early sign of TBI. Hence, TBI can result in a variety of visual problems, many patients exhibiting multiple visual defects in combination with a decline in overall health. Patients with chronic dysfunction following TBI may require occupational, vestibular, cognitive and other forms of physical therapy. Such patients may also benefit from visual rehabilitation, including reading-related oculomotor training and the prescribing of spectacles with a variety of tints and prism combinations.

Journal ArticleDOI
TL;DR: The rhesus macaque superior colliculus, a structure instrumental for rapid visual exploration with saccades, detects low spatial frequencies, which are the most prevalent in natural scenes, much more rapidly than high spatial frequencies.
Abstract: Visual brain areas exhibit tuning characteristics well suited for image statistics present in our natural environment. However, visual sensation is an active process, and if there are any brain areas that ought to be particularly in tune with natural scene statistics, it would be sensory-motor areas critical for guiding behavior. Here we found that the rhesus macaque superior colliculus, a structure instrumental for rapid visual exploration with saccades, detects low spatial frequencies, which are the most prevalent in natural scenes, much more rapidly than high spatial frequencies. Importantly, this accelerated detection happens independently of whether a neuron is more or less sensitive to low spatial frequencies to begin with. At the population level, the superior colliculus additionally over-represents low spatial frequencies in neural response sensitivity, even at near-foveal eccentricities. Thus, the superior colliculus possesses both temporal and response gain mechanisms for efficient gaze realignment in low-spatial-frequency-dominated natural environments.

Journal ArticleDOI
TL;DR: This work used a novel visual “roving standard” paradigm to elicit mismatch responses in humans by unexpected changes in either color or emotional expression of faces and combined computational modeling and electroencephalography to test whether visual mismatch responses reflected trial-by-trial pwPEs.
Abstract: Predictive coding (PC) posits that the brain uses a generative model to infer the environmental causes of its sensory data and uses precision-weighted prediction errors (pwPEs) to continuously update this model. While supported by much circumstantial evidence, experimental tests grounded in formal trial-by-trial predictions are rare. One partial exception is event-related potential (ERP) studies of the auditory mismatch negativity (MMN), where computational models have found signatures of pwPEs and related model-updating processes. Here, we tested this hypothesis in the visual domain, examining possible links between visual mismatch responses and pwPEs. We used a novel visual "roving standard" paradigm to elicit mismatch responses in humans (of both sexes) by unexpected changes in either color or emotional expression of faces. Using a hierarchical Bayesian model, we simulated pwPE trajectories of a Bayes-optimal observer and used these to conduct a comprehensive trial-by-trial analysis across the time × sensor space. We found significant modulation of brain activity by both color and emotion pwPEs. The scalp distribution and timing of these single-trial pwPE responses were in agreement with visual mismatch responses obtained by traditional averaging and subtraction (deviant-minus-standard) approaches. Finally, we compared the Bayesian model to a more classical change model of MMN. Model comparison revealed that trial-wise pwPEs explained the observed mismatch responses better than categorical change detection. Our results suggest that visual mismatch responses reflect trial-wise pwPEs, as postulated by PC. These findings go beyond classical ERP analyses of visual mismatch and illustrate the utility of computational analyses for studying automatic perceptual processes.SIGNIFICANCE STATEMENT Human perception is thought to rely on a predictive model of the environment that is updated via precision-weighted prediction errors (pwPEs) when events violate expectations. This "predictive coding" view is supported by studies of the auditory mismatch negativity brain potential. However, it is less well known whether visual perception of mismatch relies on similar processes. Here we combined computational modeling and electroencephalography to test whether visual mismatch responses reflected trial-by-trial pwPEs. Applying a Bayesian model to series of face stimuli that violated expectations about color or emotional expression, we found significant modulation of brain activity by both color and emotion pwPEs. A categorical change detection model performed less convincingly. Our findings support the predictive coding interpretation of visual mismatch responses.
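
The precision-weighted prediction error (pwPE) regressors at the heart of this analysis can be illustrated with a simple static Bayesian update, in which the prediction error is scaled by the ratio of sensory precision to total precision; the sketch below is a toy stand-in, not the hierarchical Gaussian filter used in the paper.

```python
# Toy pwPE trajectory: a belief about a stimulus feature is updated by the
# prediction error weighted by (sensory precision) / (total precision).
import numpy as np

def simulate_beliefs(observations, obs_precision=4.0, prior_mean=0.0, prior_precision=1.0):
    mu, pi = prior_mean, prior_precision
    mus, pwpes = [], []
    for y in observations:
        prediction_error = y - mu
        learning_rate = obs_precision / (pi + obs_precision)  # precision weight
        mu = mu + learning_rate * prediction_error             # belief update
        pi = pi + obs_precision                                # precision accumulates
        mus.append(mu)
        pwpes.append(learning_rate * prediction_error)
    return np.array(mus), np.array(pwpes)

# A "roving standard"-style sequence: repeated standards, then an unexpected change.
seq = np.array([0.0] * 6 + [1.0] * 6)
beliefs, pwpes = simulate_beliefs(seq)
print("pwPE trajectory:", np.round(pwpes, 3))
```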

Journal ArticleDOI
TL;DR: An efficient and robust computational framework is developed to perform Bayesian model comparison of causal inference strategies, incorporating a number of alternative assumptions about the observers, and is used to investigate whether human observers' performance in an explicit cause attribution and an implicit heading discrimination task can be modeled as a causal inference process.
Abstract: The precision of multisensory perception improves when cues arising from the same cause are integrated, such as visual and vestibular heading cues for an observer moving through a stationary environment. In order to determine how the cues should be processed, the brain must infer the causal relationship underlying the multisensory cues. In heading perception, however, it is unclear whether observers follow the Bayesian strategy, a simpler non-Bayesian heuristic, or even perform causal inference at all. We developed an efficient and robust computational framework to perform Bayesian model comparison of causal inference strategies, which incorporates a number of alternative assumptions about the observers. With this framework, we investigated whether human observers' performance in an explicit cause attribution and an implicit heading discrimination task can be modeled as a causal inference process. In the explicit causal inference task, all subjects accounted for cue disparity when reporting judgments of common cause, although not necessarily all in a Bayesian fashion. By contrast, but in agreement with previous findings, data from the heading discrimination task alone could not rule out that several of the same observers were adopting a forced-fusion strategy, whereby cues are integrated regardless of disparity. Only when we combined evidence from both tasks were we able to rule out forced fusion in the heading discrimination task. Crucially, findings were robust across a number of variants of models and analyses. Our results demonstrate that our proposed computational framework allows researchers to ask complex questions within a rigorous Bayesian framework that accounts for parameter and model uncertainty.
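
The Bayesian causal inference strategy referred to above can be sketched as follows (in the spirit of Körding et al., 2007): compute the posterior probability that visual and vestibular heading cues share a common cause, then form a model-averaged estimate. The noise levels, zero-mean prior, and the choice of the vestibular cue as the heading estimate under separate causes are illustrative assumptions, not parameters or conventions fitted in the paper.

```python
# Bayesian causal inference over two noisy heading cues (visual, vestibular).
import numpy as np

def causal_inference_estimate(x_vis, x_vest, sigma_vis, sigma_vest,
                              sigma_prior=20.0, p_common=0.5):
    var_v, var_b, var_p = sigma_vis**2, sigma_vest**2, sigma_prior**2

    # Likelihood of the two measurements under a common cause (C=1)...
    var_c1 = var_v * var_b + var_v * var_p + var_b * var_p
    like_c1 = np.exp(-0.5 * ((x_vis - x_vest)**2 * var_p +
                             x_vis**2 * var_b + x_vest**2 * var_v) / var_c1) \
              / (2 * np.pi * np.sqrt(var_c1))
    # ...and under independent causes (C=2), with a zero-mean Gaussian prior.
    like_c2 = (np.exp(-0.5 * x_vis**2 / (var_v + var_p)) / np.sqrt(2 * np.pi * (var_v + var_p))
               * np.exp(-0.5 * x_vest**2 / (var_b + var_p)) / np.sqrt(2 * np.pi * (var_b + var_p)))

    post_c1 = like_c1 * p_common / (like_c1 * p_common + like_c2 * (1 - p_common))

    # Optimal estimates under each causal structure (zero-mean heading prior).
    s_c1 = (x_vis / var_v + x_vest / var_b) / (1 / var_v + 1 / var_b + 1 / var_p)
    s_c2 = (x_vest / var_b) / (1 / var_b + 1 / var_p)  # vestibular-only estimate (assumption)
    return post_c1, post_c1 * s_c1 + (1 - post_c1) * s_c2  # model averaging

p_c1, s_hat = causal_inference_estimate(x_vis=5.0, x_vest=-2.0,
                                        sigma_vis=2.0, sigma_vest=4.0)
print(f"P(common cause) = {p_c1:.2f}, heading estimate = {s_hat:.2f} deg")
```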

Journal ArticleDOI
TL;DR: Converging empirical evidence points to a seeming paradox: crowding happens at multiple levels, which would seem to impair object recognition, and yet visual information at each of those levels is maintained intact and influences subsequent higher-level visual processing.

Journal ArticleDOI
TL;DR: A neural signature of serial dependence is demonstrated in numerosity perception emerging early in the visual processing stream even in the absence of an explicit task, which is consistent with the view that these biases smooth out noise from neural signals to establish perceptual continuity.
Abstract: Attractive serial dependence refers to an adaptive change in the representation of sensory information, whereby a current stimulus appears to be similar to a previous one. The nature of this phenomenon is controversial, however, as serial dependence could arise from biased perceptual representations or from biased traces of working memory representation at a decisional stage. Here, we demonstrated a neural signature of serial dependence in numerosity perception emerging early in the visual processing stream even in the absence of an explicit task. Furthermore, a psychophysical experiment revealed that numerosity perception is biased by a previously presented stimulus in an attractive way, not by repulsive adaptation. These results suggest that serial dependence is a perceptual phenomenon starting from early levels of visual processing and occurring independently from a decision process, which is consistent with the view that these biases smooth out noise from neural signals to establish perceptual continuity.