Open Access · Journal Article (DOI)

Mapping human visual representations in space and time by neural networks

TLDR
CNNs are a promising formal model of human visual object recognition. Combined with fMRI and MEG, they provide an integrated spatiotemporal and algorithmically explicit view of the first few hundred milliseconds of object recognition.
Abstract
The neural machinery underlying visual object recognition comprises a hierarchy of cortical regions in the ventral visual stream. The spatiotemporal dynamics of information flow in this hierarchy of regions are largely unknown. Here we tested the hypothesis that there is a correspondence between the spatiotemporal neural processes in the human brain and the layer hierarchy of a deep convolutional neural network (CNN). We presented 118 images of real-world objects to human participants (N = 15) while we measured their brain activity with functional magnetic resonance imaging (fMRI) and magnetoencephalography (MEG). We trained an 8-layer CNN (5 convolutional layers, 3 fully connected layers) to predict 683 object categories with 900K training images from the ImageNet dataset, and obtained layer-specific CNN responses to the same 118 images. To compare brain-imaging data with the CNN in a common framework, we used representational similarity analysis. The key idea is that if two conditions evoke similar patterns in brain-imaging data, they should also evoke similar patterns in the computer model. We thus determined 'where' (fMRI) and 'when' (MEG) the CNN predicted brain activity. We found a correspondence in hierarchy between cortical regions, processing time, and CNN layers: low CNN layers predicted MEG activity early and high layers relatively later; low CNN layers predicted fMRI activity in early visual regions, and high layers in late visual regions. Surprisingly, the correspondence between CNN layer hierarchy and cortical regions held for both the ventral and dorsal visual streams. Results depended on the amount of training and the type of training material. Our results show that CNNs are a promising formal model of human visual object recognition. Combined with fMRI and MEG, they provide an integrated spatiotemporal and algorithmically explicit view of the first few hundred milliseconds of object recognition. Meeting abstract presented at VSS 2015.
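The representational similarity analysis described above can be sketched in a few lines. This is a minimal illustration with synthetic random data standing in for real fMRI/MEG patterns and CNN activations; the variable names and data shapes are assumptions for the example, not the study's actual pipeline:

```python
# Minimal RSA sketch, assuming synthetic data in place of real
# brain-imaging patterns and CNN layer activations.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_images = 118  # number of stimulus images, as in the study

# Hypothetical response patterns: rows = images, columns = features/voxels.
brain_patterns = rng.standard_normal((n_images, 500))      # e.g., fMRI voxels
layer_activations = rng.standard_normal((n_images, 4096))  # e.g., one CNN layer

def rdm(patterns):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between the response patterns of every pair of images, returned as
    the condensed upper triangle."""
    return pdist(patterns, metric="correlation")

# If two conditions evoke similar patterns in the brain data, they should
# also evoke similar patterns in the model, so the two RDMs should correlate.
rho, p = spearmanr(rdm(brain_patterns), rdm(layer_activations))
print(f"RDM correlation (Spearman rho): {rho:.3f}")
```

Repeating this comparison per CNN layer, per cortical region (fMRI), and per time point (MEG) yields the layer-by-region and layer-by-time correspondence maps the abstract describes.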


Citations
Proceedings Article (DOI)

Shallow and Deep Convolutional Networks for Saliency Prediction

TL;DR: In this paper, the authors propose a completely data-driven approach by training a convolutional neural network (convnet) for saliency prediction, where the learning process is formulated as the minimization of a loss function that measures the Euclidean distance between the predicted saliency map and the provided ground truth.
Posted Content

Shallow and Deep Convolutional Networks for Saliency Prediction

TL;DR: This paper addresses the problem with a completely data-driven approach by training a convolutional neural network (convnet), and proposes two designs: a shallow convnet trained from scratch, and another, deeper solution whose first three layers are adapted from a network trained for classification.
Journal Article (DOI)

Spatiotemporal visual saliency guided perceptual high efficiency video coding with neural network

TL;DR: A hybrid compression algorithm that uses a deep convolutional neural network to compute spatial saliency, followed by extraction of temporal saliency from compressed-domain motion information; a rate-distortion calculation method is also proposed to choose the coding pattern and guide the allocation of bits during video compression.
Journal Article (DOI)

A Deep Learning Approach for Breast Cancer Mass Detection

TL;DR: The pre-trained ResNet-50 architecture and the Class Activation Map (CAM) technique are employed for breast cancer classification and localization, respectively; notably, the pre-trained CNN is able to automatically learn the most discriminative features in the mammogram and thereby achieves superior results in breast cancer classification (normal or mass).
Journal Article (DOI)

High-Definition Video Compression System Based on Perception Guidance of Salient Information of a Convolutional Neural Network and HEVC Compression Domain

TL;DR: A more flexible QP selection method, which chooses each coding unit's QP according to its saliency value, together with a new rate-distortion optimization algorithm that integrates the current block's saliency feature into the traditional rate-distortion calculation to guide the allocation of bits and prioritize perceptual quality.