Open Access · Journal Article (DOI)

Mapping human visual representations in space and time by neural networks

TLDR
CNNs are a promising formal model of human visual object recognition. Combined with fMRI and MEG, they provide an integrated spatiotemporal and algorithmically explicit view of the first few hundred milliseconds of object recognition.
Abstract
The neural machinery underlying visual object recognition comprises a hierarchy of cortical regions in the ventral visual stream. The spatiotemporal dynamics of information flow in this hierarchy of regions are largely unknown. Here we tested the hypothesis that there is a correspondence between the spatiotemporal neural processes in the human brain and the layer hierarchy of a deep convolutional neural network (CNN). We presented 118 images of real-world objects to human participants (N = 15) while we measured their brain activity with functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG). We trained an 8-layer CNN (5 convolutional layers, 3 fully connected layers) to predict 683 object categories with 900K training images from the ImageNet dataset, and obtained layer-specific CNN responses to the same 118 images. To compare brain-imaging data with the CNN in a common framework, we used representational similarity analysis. The key idea is that if two conditions evoke similar patterns in brain-imaging data, they should also evoke similar patterns in the computer model. We thus determined 'where' (fMRI) and 'when' (MEG) the CNN predicted brain activity. We found a correspondence in hierarchy between cortical regions, processing time, and CNN layers: low CNN layers predicted MEG activity early and high layers relatively later; low CNN layers predicted fMRI activity in early visual regions, and high layers in late visual regions. Surprisingly, the correspondence between CNN layer hierarchy and cortical regions held for both the ventral and dorsal visual streams. Results depended on the amount of training and the type of training material. Our results show that CNNs are a promising formal model of human visual object recognition. Combined with fMRI and MEG, they provide an integrated spatiotemporal and algorithmically explicit view of the first few hundred milliseconds of object recognition. Meeting abstract presented at VSS 2015.
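The representational similarity analysis described above can be sketched in a few lines. This is a minimal illustration with synthetic random data standing in for real fMRI/MEG patterns and CNN activations; the variable names and data shapes are assumptions for the example, not the study's actual pipeline:

```python
# Minimal RSA sketch, assuming synthetic data in place of real
# brain-imaging patterns and CNN layer activations.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_images = 118  # number of stimulus images, as in the study

# Hypothetical response patterns: rows = images, columns = features/voxels.
brain_patterns = rng.standard_normal((n_images, 500))      # e.g., fMRI voxels
layer_activations = rng.standard_normal((n_images, 4096))  # e.g., one CNN layer

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the response patterns of every pair of images, returned as
    the condensed upper triangle."""
    return pdist(patterns, metric="correlation")

# If two conditions evoke similar patterns in the brain data, they should
# also evoke similar patterns in the model, so the two RDMs should correlate.
rho, p = spearmanr(rdm(brain_patterns), rdm(layer_activations))
print(f"RDM correlation (Spearman rho): {rho:.3f}")
```

Repeating this comparison per CNN layer, per cortical region (fMRI), and per time point (MEG) yields the layer-by-region and layer-by-time correspondence maps the abstract describes.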


Citations
Proceedings Article (DOI)

Shallow and Deep Convolutional Networks for Saliency Prediction

TL;DR: In this paper, the authors propose a completely data-driven approach by training a convolutional neural network (convnet) for saliency prediction, where the learning process is formulated as the minimization of a loss function that measures the Euclidean distance between the predicted saliency map and the provided ground truth.
Posted Content

Shallow and Deep Convolutional Networks for Saliency Prediction

TL;DR: This paper addresses the problem with a completely data-driven approach by training a convolutional neural network (convnet), and proposes two designs: a shallow convnet trained from scratch, and another, deeper solution whose first three layers are adapted from a network trained for classification.
Journal Article (DOI)

Spatiotemporal visual saliency guided perceptual high efficiency video coding with neural network

TL;DR: A hybrid compression algorithm that uses a deep convolutional neural network to compute spatial saliency, followed by extraction of temporal saliency from compressed-domain motion information; a rate-distortion calculation method is also proposed to choose the coding pattern and guide the allocation of bits during video compression.
Journal Article (DOI)

A Deep Learning Approach for Breast Cancer Mass Detection

TL;DR: The pre-trained ResNet-50 architecture and the Class Activation Map (CAM) technique are employed for breast cancer classification and localization, respectively; notably, the pre-trained CNN is able to automatically learn the most discriminative features in the mammogram and thereby achieves superior results in breast cancer classification (normal or mass).
Journal Article (DOI)

High-Definition Video Compression System Based on Perception Guidance of Salient Information of a Convolutional Neural Network and HEVC Compression Domain

TL;DR: A more flexible QP selection method, which chooses each coding unit's QP according to its saliency value, together with a new rate-distortion optimization algorithm that integrates the current block's saliency feature into the traditional rate-distortion calculation to guide the allocation of bits and prioritize perceptual quality.