Timothy P. Lillicrap

Researcher at University College London

Publications -  176
Citations -  68729

Timothy P. Lillicrap is an academic researcher from University College London. The author has contributed to research in topics: Reinforcement learning & Artificial neural network. The author has an h-index of 66 and has co-authored 160 publications receiving 49666 citations. Previous affiliations of Timothy P. Lillicrap include Google & Queen's University.

Papers
Journal Article

Mastering the game of Go with deep neural networks and tree search

TL;DR: Using a search algorithm that combines Monte Carlo simulation with value and policy networks, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
Journal Article

Mastering the game of Go without human knowledge

TL;DR: An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
Proceedings Article

Asynchronous methods for deep reinforcement learning

TL;DR: A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.
Posted Content

Continuous control with deep reinforcement learning

TL;DR: This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
Proceedings Article

Matching networks for one shot learning

TL;DR: This paper presents a network that maps a small labeled support set and an unlabeled example to its label, obviating the need for fine-tuning to adapt to new class types.