Christopher Hesse
Researcher at OpenAI
Publications - 12
Citations - 14287
Christopher Hesse is an academic researcher at OpenAI who has contributed to research on reinforcement learning and benchmarking. The author has an h-index of 10 and has co-authored 12 publications receiving 4,326 citations.
Papers
Proceedings Article
Language Models are Few-Shot Learners
Tom B. Brown,Benjamin Mann,Nick Ryder,Melanie Subbiah,Jared Kaplan,Prafulla Dhariwal,Arvind Neelakantan,Pranav Shyam,Girish Sastry,Amanda Askell,Sandhini Agarwal,Ariel Herbert-Voss,Gretchen Krueger,Thomas Henighan,Rewon Child,Aditya Ramesh,Daniel M. Ziegler,Jeffrey Wu,Clemens Winter,Christopher Hesse,Mark Chen,Eric Sigler,Mateusz Litwin,Scott Gray,Benjamin Chess,Jack Clark,Christopher Berner,Samuel McCandlish,Alec Radford,Ilya Sutskever,Dario Amodei +30 more
TL;DR: GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
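The few-shot setting described above amounts to placing labeled demonstrations in the model's context before the query, with no gradient updates. A minimal sketch of how such a prompt might be assembled (illustrative only; the function name and format are hypothetical, not GPT-3's actual pipeline):

```python
# Sketch of few-shot prompt construction: k demonstration pairs are
# concatenated before the unanswered query, so the model conditions
# on them in-context. (Hypothetical illustration, not GPT-3's code.)

def build_few_shot_prompt(task_desc, examples, query):
    """Format k demonstration pairs followed by the unanswered query."""
    lines = [task_desc]
    for src, tgt in examples:
        lines.append(f"Q: {src}\nA: {tgt}")
    lines.append(f"Q: {query}\nA:")
    return "\n\n".join(lines)

# Word-unscrambling, one of the on-the-fly tasks mentioned in the TL;DR:
prompt = build_few_shot_prompt(
    "Unscramble the letters into a word.",
    [("lpepa", "apple"), ("nanaab", "banana")],
    "rgaep",
)
print(prompt)
```

The demonstrations themselves carry the task specification; the model is expected to complete the final `A:` line.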
Posted Content
Dota 2 with Large Scale Deep Reinforcement Learning
Christopher Berner,Greg Brockman,Brooke Chan,Vicki Cheung,Przemyslaw Debiak,Christy Dennison,David Farhi,Quirin Fischer,Shariq Hashme,Christopher Hesse,Rafal Jozefowicz,Scott Gray,Catherine Olsson,Jakub Pachocki,Michael Petrov,Henrique Ponde de Oliveira Pinto,Jonathan Raiman,Tim Salimans,Jeremy Schlatter,Jonas Schneider,Szymon Sidor,Ilya Sutskever,Jie Tang,Filip Wolski,Susan Zhang +24 more
TL;DR: By defeating the Dota 2 world champion (Team OG), OpenAI Five demonstrates that self-play reinforcement learning can achieve superhuman performance on a difficult task.
Proceedings Article
Quantifying Generalization in Reinforcement Learning
TL;DR: It is shown that deeper convolutional architectures improve generalization, as do methods traditionally found in supervised learning, including L2 regularization, dropout, data augmentation and batch normalization.
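One of the supervised-learning methods named above, L2 regularization, adds a weight-decay penalty to the training loss. A toy sketch of the arithmetic, with made-up weights and coefficient (not the paper's code or hyperparameters):

```python
# Minimal sketch of L2 regularization as a generalization aid:
# total loss = task loss + lambda * sum(w^2). Toy numbers only.

def l2_penalty(weights, lam):
    """Weight-decay term: lambda times the sum of squared weights."""
    return lam * sum(w * w for w in weights)

def regularized_loss(task_loss, weights, lam=1e-4):
    return task_loss + l2_penalty(weights, lam)

# Example: 0.5 + 0.01 * (0.1^2 + 0.2^2 + 0.3^2) = 0.5014
loss = regularized_loss(0.5, [0.1, -0.2, 0.3], lam=0.01)
print(round(loss, 4))  # 0.5014
```

The penalty discourages large weights, which in the RL setting above empirically reduces overfitting to the training environments.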
Proceedings Article
Leveraging Procedural Generation to Benchmark Reinforcement Learning
TL;DR: This work empirically demonstrates that diverse environment distributions are essential to adequately train and evaluate RL agents, motivating the extensive use of procedural content generation; the benchmark is also used to investigate the effects of scaling model size.
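The core idea behind procedurally generated benchmarks is that each random seed deterministically yields a distinct level, so disjoint seed ranges give train and test distributions for measuring generalization. A toy sketch of that seed-to-level mapping (hypothetical layout code, not the actual benchmark implementation):

```python
import random

# Toy procedural level generator: each seed deterministically
# produces a distinct grid layout ('#' = wall, '.' = floor), so
# train/test splits over seed ranges measure generalization.
# (Hypothetical sketch, not the benchmark's implementation.)

def generate_level(seed, width=8, height=4, wall_prob=0.25):
    rng = random.Random(seed)  # per-level RNG: same seed, same level
    return [
        ["#" if rng.random() < wall_prob else "." for _ in range(width)]
        for _ in range(height)
    ]

train_levels = [generate_level(s) for s in range(100)]       # training distribution
test_levels = [generate_level(s) for s in range(100, 200)]   # held-out levels
```

Because the seed fully determines the level, an agent's score on the held-out seed range cleanly separates memorization of training levels from genuine generalization.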