
Showing papers by "Geoffrey E. Hinton published in 2021"


Journal ArticleDOI
TL;DR: This paper asks how neural networks can learn the rich internal representations required for difficult tasks such as recognizing objects or understanding language.
Abstract: How can neural networks learn the rich internal representations required for difficult tasks such as recognizing objects or understanding language?

294 citations


Posted Content
TL;DR: The authors present GLOM, an imaginary system that combines advances from transformers, neural fields, contrastive representation learning, distillation, and capsules. The paper does not describe a working system; instead it presents a single idea about representation that allows these advances, made by several different groups, to be combined.
Abstract: This paper does not describe a working system. Instead, it presents a single idea about representation which allows advances made by several different groups to be combined into an imaginary system called GLOM. The advances include transformers, neural fields, contrastive representation learning, distillation and capsules. GLOM answers the question: How can a neural network with a fixed architecture parse an image into a part-whole hierarchy which has a different structure for each image? The idea is simply to use islands of identical vectors to represent the nodes in the parse tree. If GLOM can be made to work, it should significantly improve the interpretability of the representations produced by transformer-like systems when applied to vision or language.
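The "islands of identical vectors" idea from the abstract can be illustrated with a toy sketch. The helper below (`find_islands`, a hypothetical name, not from the paper) groups adjacent locations whose embedding vectors at one level nearly agree into an island; each island stands in for one node of the parse tree at that level.

```python
import numpy as np

def find_islands(level_vectors, tau=0.95):
    """Group adjacent locations whose level embeddings nearly agree
    (cosine similarity > tau) into 'islands' -- a stand-in for nodes
    of GLOM's parse tree at that level.

    level_vectors: (num_locations, dim) array of embeddings along a
    1-D strip of image locations (a simplification of the 2-D case).
    """
    norms = level_vectors / np.linalg.norm(level_vectors, axis=1, keepdims=True)
    islands, current = [], [0]
    for i in range(1, level_vectors.shape[0]):
        if norms[i] @ norms[i - 1] > tau:
            current.append(i)        # still inside the same island
        else:
            islands.append(current)  # agreement broke: close the island
            current = [i]
    islands.append(current)
    return islands
```

With three locations sharing one vector and two sharing another, the sketch recovers two islands, i.e. two parse-tree nodes.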

39 citations


Posted Content
Ting Chen1, Saurabh Saxena, Lala Li, David J. Fleet, Geoffrey E. Hinton 
TL;DR: Pix2Seq casts object detection as a language modeling task conditioned on the observed pixel inputs: object descriptions (e.g., bounding boxes and class labels) are expressed as sequences of discrete tokens, and a neural network is trained to perceive the image and generate the desired sequence.
Abstract: This paper presents Pix2Seq, a simple and generic framework for object detection. Unlike existing approaches that explicitly integrate prior knowledge about the task, we simply cast object detection as a language modeling task conditioned on the observed pixel inputs. Object descriptions (e.g., bounding boxes and class labels) are expressed as sequences of discrete tokens, and we train a neural net to perceive the image and generate the desired sequence. Our approach is based mainly on the intuition that if a neural net knows about where and what the objects are, we just need to teach it how to read them out. Beyond the use of task-specific data augmentations, our approach makes minimal assumptions about the task, yet it achieves competitive results on the challenging COCO dataset, compared to highly specialized and well optimized detection algorithms.
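The token encoding described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names, the bin count, and the layout (four quantized coordinates followed by a class token) are assumptions for the sketch.

```python
import numpy as np

def box_to_tokens(box, label, num_bins=500, image_size=640):
    """Quantize a bounding box (xmin, ymin, xmax, ymax, in pixels) into
    discrete coordinate tokens, then append a class-label token."""
    coords = np.asarray(box, dtype=np.float64)
    # Map each pixel coordinate into one of `num_bins` integer bins.
    tokens = np.round(coords / image_size * (num_bins - 1)).astype(int).tolist()
    # Class labels occupy the token range after the coordinate bins.
    tokens.append(num_bins + label)
    return tokens

def tokens_to_box(tokens, num_bins=500, image_size=640):
    """Invert the quantization (lossy, up to half a bin width)."""
    coords = [t / (num_bins - 1) * image_size for t in tokens[:4]]
    label = tokens[4] - num_bins
    return coords, label
```

Once boxes are token sequences, detection reduces to next-token prediction, which is the sense in which the paper "casts object detection as a language modeling task".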

4 citations



Proceedings Article
03 May 2021
TL;DR: This paper proposes a flexible teaching framework using commentaries, meta-learned information helpful for training on a particular task or dataset, and explores diverse applications of commentaries, from learning weights for individual training examples, to parameterising label-dependent data augmentation policies, to representing attention masks that highlight salient image regions.
Abstract: Effective training of deep neural networks can be challenging, and there remain many open questions on how to best learn these models. Recently developed methods to improve neural network training examine teaching: providing learned information during the training process to improve downstream model performance. In this paper, we take steps towards extending the scope of teaching. We propose a flexible teaching framework using commentaries, meta-learned information helpful for training on a particular task or dataset. We present an efficient and scalable gradient-based method to learn commentaries, leveraging recent work on implicit differentiation. We explore diverse applications of commentaries, from learning weights for individual training examples, to parameterising label-dependent data augmentation policies, to representing attention masks that highlight salient image regions. In these settings, we find that commentaries can improve training speed and/or performance and also provide fundamental insights about the dataset and training process.
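One application named in the abstract, commentaries as learned weights for individual training examples, can be sketched in a few lines. The function below is a hypothetical illustration of only the inner-loop side (rescaling each example's loss by a commentary-derived weight); the meta-learning of the commentary parameters via implicit differentiation is not shown.

```python
import numpy as np

def commentary_weighted_loss(per_example_losses, commentary_logits):
    """Rescale each training example's loss by a weight derived from
    commentary parameters (here, one logit per example), normalised so
    the weights average to 1. Uniform logits recover the plain mean."""
    w = np.exp(commentary_logits - commentary_logits.max())  # stable softmax
    w = w / w.sum() * len(w)
    return float((w * per_example_losses).mean())
```

In the full framework the commentary logits would themselves be optimized in an outer loop so that down-weighting or up-weighting particular examples improves validation performance.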