Timothy P. Lillicrap

Journal Article

Mastering the game of Go with deep neural networks and tree search

- 28 Jan 2016 -

TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.

Journal Article

Mastering the game of Go without human knowledge

David Silver, +16 more

- 19 Oct 2017 -

Nature

TL;DR: An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.

Proceedings Article

Asynchronous methods for deep reinforcement learning

Volodymyr Mnih, +7 more

TL;DR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.

Posted Content

Continuous control with deep reinforcement learning

Timothy P. Lillicrap, +7 more

- 09 Sep 2015 -

arXiv: Learning

TL;DR: This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.

Proceedings Article

Matching networks for one shot learning

Oriol Vinyals, +4 more

TL;DR: A network that maps a small labeled support set and an unlabeled example to its label, obviating the need for fine-tuning to adapt to new class types.
