Understanding and Predicting Image Memorability at a Large Scale

doi:10.1109/ICCV.2015.275

Open Access Proceedings Article

Aditya Khosla, +3 more

- pp 2390-2398

TLDR

The authors build LaMem, the largest annotated image memorability dataset to date, and use Convolutional Neural Networks to demonstrate that one can now robustly estimate the memorability of images from many different classes, positioning memorability and deep memorability features as prime candidates to estimate the utility of information for cognitive systems.

Abstract:

Progress in estimating visual memorability has been limited by the small scale and lack of variety of benchmark data. Here, we introduce a novel experimental procedure to objectively measure human memory, allowing us to build LaMem, the largest annotated image memorability dataset to date (containing 60,000 images from diverse sources). Using Convolutional Neural Networks (CNNs), we show that fine-tuned deep features outperform all other features by a large margin, reaching a rank correlation of 0.64, near human consistency (0.68). Analysis of the responses of the high-level CNN layers shows which objects and regions are positively, and negatively, correlated with memorability, allowing us to create memorability maps for each image and provide a concrete method to perform image memorability manipulation. This work demonstrates that one can now robustly estimate the memorability of images from many different classes, positioning memorability and deep memorability features as prime candidates to estimate the utility of information for cognitive systems. Our model and data are available at: http://memorability.csail.mit.edu.
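The abstract reports prediction quality as a rank correlation between predicted and measured memorability. As a minimal sketch, assuming the metric is Spearman's rank correlation, the computation can be illustrated as follows. The function names and the five example scores are hypothetical and are not taken from the paper or from LaMem.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical memorability scores for five images (illustration only).
measured = [0.9, 0.7, 0.8, 0.4, 0.6]
predicted = [0.85, 0.75, 0.65, 0.5, 0.55]
print(round(spearman(measured, predicted), 2))  # 0.9
```

Because only the ordering of the scores matters, a predictor can reach a high rank correlation even when its raw outputs are on a different scale from the measured scores.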

Citations

Posted Content

Eye Tracking for Everyone

Kyle Krafka, +6 more

- 18 Jun 2016 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: iTracker, a convolutional neural network for eye tracking, is trained, which achieves a significant reduction in error over previous approaches while running in real time (10-15fps) on a modern mobile device.

Proceedings Article

Eye Tracking for Everyone

Kyle Krafka, +6 more

TL;DR: GazeCapture, introduced in this paper, is the first large-scale dataset for eye tracking, containing data from over 1450 people and almost 2.5M frames. It is used to train iTracker, a convolutional neural network that achieves a significant reduction in error over previous approaches while running in real time (10-15fps) on a modern mobile device.

Journal Article

Fine-tuning Convolutional Neural Networks for fine art classification

Eva Cetinic, +2 more

- 30 Dec 2018 -

Expert Systems With Applications

TL;DR: It is shown that features derived from fine-tuned networks can be employed to retrieve images similar in either style or content, which can be used to enhance capabilities of search systems in different online art collections.

Lore Goetschalckx, Alex Andonian, Aude Oliva, Phillip Isola: GANalyze: Toward Visual Definitions of Cognitive Image Properties.

Lore Goetschalckx, +3 more

TL;DR: This article introduces a framework that uses Generative Adversarial Networks (GANs) to study cognitive properties like memorability; GANs make it possible to generate a manifold of natural-looking images with fine-grained differences in their visual attributes.

Proceedings Article

GANalyze: Toward Visual Definitions of Cognitive Image Properties

Lore Goetschalckx, +3 more

TL;DR: A framework that uses Generative Adversarial Networks (GANs) to study cognitive properties like memorability is introduced and it is demonstrated that the same framework can be used to analyze image aesthetics and emotional valence.

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: A deep convolutional neural network, consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, achieved state-of-the-art performance on ImageNet classification.

Proceedings Article

Histograms of oriented gradients for human detection

Navneet Dalal, +1 more

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Journal Article

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

- 01 Dec 2015 -

International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images; it has been run annually from 2010 to present, attracting participation from more than fifty institutions.

Proceedings Article

Fully convolutional networks for semantic segmentation

Jonathan Long, +2 more

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.

Proceedings Article

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Ross Girshick, +3 more

TL;DR: R-CNN combines CNNs with bottom-up region proposals to localize and segment objects; when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost.

Proceedings of the National Academy of S...

SUN database: Large-scale scene recognition from abbey to zoo

Jianxiong Xiao, +4 more

Related Papers (5)

ImageNet Classification with Deep Convolutional Neural Networks

Deep Residual Learning for Image Recognition

Learning Deep Features for Scene Recognition using Places Database

Visual long-term memory has a massive storage capacity for object details

SUN database: Large-scale scene recognition from abbey to zoo