Open Access Journal Article (DOI)

Shortcut Learning in Deep Neural Networks

TL;DR
In this paper, a set of recommendations for model interpretation and benchmarking is presented, highlighting recent advances in machine learning that improve robustness and transferability from the lab to real-world applications.
Abstract
Deep learning has triggered the current rise of artificial intelligence and is the workhorse of today's machine intelligence. Numerous success stories have rapidly spread all over science, industry and society, but its limitations have only recently come into focus. In this perspective we seek to distil how many of deep learning's problems can be seen as different symptoms of the same underlying problem: shortcut learning. Shortcuts are decision rules that perform well on standard benchmarks but fail to transfer to more challenging testing conditions, such as real-world scenarios. Related issues are known in Comparative Psychology, Education and Linguistics, suggesting that shortcut learning may be a common characteristic of learning systems, biological and artificial alike. Based on these observations, we develop a set of recommendations for model interpretation and benchmarking, highlighting recent advances in machine learning to improve robustness and transferability from the lab to real-world applications.
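A minimal sketch (not from the paper) of the shortcut-learning failure mode on synthetic data: a linear classifier latches onto an easy spurious feature that is perfectly correlated with the label during training, then degrades once that correlation breaks at test time. All names and data below are illustrative.

```python
# Toy demonstration of shortcut learning: the classifier prefers the easy
# "shortcut" feature over the harder "core" feature, so it fails under shift.
import numpy as np

rng = np.random.default_rng(0)
n = 2000

def make_data(shortcut_corr):
    y = rng.integers(0, 2, n)                          # binary labels
    core = y + 0.9 * rng.normal(size=n)                # weakly informative true feature
    agree = rng.random(n) < shortcut_corr              # does the shortcut agree with y?
    shortcut = np.where(agree, y, 1 - y) + 0.1 * rng.normal(size=n)  # easy feature
    return np.stack([core, shortcut], axis=1), y

X_tr, y_tr = make_data(1.0)   # training: shortcut perfectly predicts the label
X_te, y_te = make_data(0.5)   # test: the shortcut correlation is broken

# plain logistic regression trained by gradient descent
w, b = np.zeros(2), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X_tr @ w + b)))
    w -= 0.5 * X_tr.T @ (p - y_tr) / n
    b -= 0.5 * (p - y_tr).mean()

acc = lambda X, y: (((X @ w + b) > 0) == y).mean()
print(f"train acc: {acc(X_tr, y_tr):.2f}  shifted test acc: {acc(X_te, y_te):.2f}")
print("weights (core, shortcut):", w)   # the shortcut weight dominates
```

Training accuracy is near perfect while accuracy under the shifted test distribution collapses, the signature of a shortcut rather than a transferable decision rule.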


Citations
Posted Content

The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization

TL;DR: It is found that using larger models and artificial data augmentations can improve robustness on real-world distribution shifts, contrary to claims in prior work.
Posted Content

Natural Adversarial Examples

TL;DR: This work introduces two challenging datasets that reliably cause machine learning model performance to degrade substantially, and curates an adversarial out-of-distribution detection dataset called IMAGENET-O, the first out-of-distribution detection dataset created for ImageNet models.
Posted Content

Learning Transferable Visual Models From Natural Language Supervision

TL;DR: In this article, a pre-training task of predicting which caption goes with which image is used to learn SOTA image representations from scratch on a dataset of 400 million (image, text) pairs collected from the internet.
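A hedged sketch of that contrastive pre-training objective in PyTorch, assuming two generic encoders that map images and captions to fixed-size embeddings; the helper name `clip_loss` and the toy inputs are illustrative, not the authors' code.

```python
# CLIP-style objective: in a batch of (image, text) pairs, each image must
# pick out its own caption among all captions in the batch, and vice versa.
import torch
import torch.nn.functional as F

def clip_loss(image_emb, text_emb, temperature=0.07):
    """image_emb, text_emb: (batch, dim) encoder outputs; matching pairs share an index."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature    # pairwise cosine similarities
    targets = torch.arange(logits.size(0))             # correct pairs lie on the diagonal
    # symmetric cross-entropy: images classify captions, captions classify images
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

loss = clip_loss(torch.randn(8, 512), torch.randn(8, 512))  # toy random embeddings
```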
Posted Content

Data and its (dis)contents: A survey of dataset development and use in machine learning research

TL;DR: This work surveys a breadth of literature revealing the limitations of predominant practices for dataset collection and use in machine learning, and advocates the use of both qualitative and quantitative approaches to more carefully document and analyze datasets during their creation and use.
Posted Content

Improving robustness against common corruptions by covariate shift adaptation

TL;DR: It is argued that results with adapted statistics should be included whenever reporting scores in corruption benchmarks and other out-of-distribution generalization settings, and that as few as 32 samples are sufficient to improve the current state of the art for a ResNet-50 architecture.
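A sketch of what such statistics adaptation could look like in PyTorch: re-estimating BatchNorm running statistics from a small batch of corrupted test inputs before evaluation. The helper `adapt_bn` is illustrative and may differ from the paper's exact procedure.

```python
# Covariate shift adaptation sketch: only BatchNorm running mean/var are
# updated from test data; no model weights change.
import torch
import torchvision

def adapt_bn(model, test_batch):
    """Re-estimate BatchNorm statistics from one batch of (corrupted) test inputs."""
    for m in model.modules():
        if isinstance(m, torch.nn.BatchNorm2d):
            m.reset_running_stats()
            m.momentum = None        # cumulative averaging: one pass sets the stats
            m.train()
    with torch.no_grad():
        model(test_batch)            # forward pass updates running mean/var only
    model.eval()

model = torchvision.models.resnet50(weights=None)   # pretrained weights assumed in practice
adapt_bn(model, torch.randn(32, 3, 224, 224))       # 32 samples, per the TL;DR
```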
References
Proceedings Article

Attention is All you Need

TL;DR: This paper proposes a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
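The core operation is the paper's scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. Below is a minimal single-head sketch in PyTorch; multi-head projection and masking details are omitted.

```python
# Scaled dot-product attention: each query position attends to all key
# positions and returns an attention-weighted average of the values.
import torch

def attention(q, k, v):
    """q, k, v: (batch, seq, d_k). Returns (batch, seq, d_k)."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / d_k ** 0.5   # query-key similarities
    return scores.softmax(dim=-1) @ v               # weighted sum of values

out = attention(torch.randn(2, 5, 64), torch.randn(2, 5, 64), torch.randn(2, 5, 64))
print(out.shape)  # torch.Size([2, 5, 64])
```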
Journal Article (DOI)

ImageNet Large Scale Visual Recognition Challenge

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Proceedings Article (DOI)

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: BERT, as described in this paper, pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
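A minimal sketch of that fine-tuning recipe using the Hugging Face `transformers` library (an assumption here, not part of the original paper): the pre-trained encoder plus one randomly initialized classification layer on top.

```python
# BERT fine-tuning setup: the sequence-classification head is the single
# additional output layer; its weights are trained on the downstream task.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

batch = tok(["shortcuts fail to transfer"], return_tensors="pt")
with torch.no_grad():
    logits = model(**batch).logits   # shape (1, 2): scores from the added layer
```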
Journal Article (DOI)

Multilayer feedforward networks are universal approximators

TL;DR: It is rigorously established that standard multilayer feedforward networks with as few as one hidden layer, using arbitrary squashing functions, are capable of approximating any Borel measurable function from one finite-dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available.
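Stated symbolically, for the uniform-approximation case of a continuous target f on a compact set K ⊂ ℝⁿ (notation ours, not the paper's):

```latex
% One-hidden-layer uniform approximation on compacta; sigma is any squashing
% function, N the number of hidden units, alpha_j, b_j, w_j the network parameters.
\[
\forall \varepsilon > 0 \;\; \exists\, N \in \mathbb{N},\;
\alpha_j, b_j \in \mathbb{R},\; w_j \in \mathbb{R}^{n} :\quad
\sup_{x \in K} \Bigl| f(x) - \sum_{j=1}^{N} \alpha_j\, \sigma\!\bigl(w_j^{\top} x + b_j\bigr) \Bigr| < \varepsilon .
\]
```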
Journal Article (DOI)

Mastering the game of Go with deep neural networks and tree search

TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion by 5 games to 0, the first time a computer program has defeated a human professional player in the full-sized game of Go.