Yutian Chen

Journal Article

Mastering the game of Go without human knowledge

- 19 Oct 2017 -

TL;DR: An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.

Journal Article

A Generalist Agent

S Reed, +19 more

TL;DR: Gato is a generalist agent that can play Atari, caption images, chat, stack blocks with a real robot arm and much more, deciding based on its context whether to output text, joint torques, button presses, or other tokens.

Proceedings Article

Learning to Learn without Gradient Descent by Gradient Descent

Yutian Chen, +6 more

TL;DR: It is shown that recurrent neural network optimizers trained on simple synthetic functions by gradient descent exhibit a remarkable degree of transfer in that they can be used to efficiently optimize a broad range of derivative-free black-box functions, including Gaussian process bandits, simple control objectives, global optimization benchmarks and hyper-parameter tuning tasks.

Proceedings Article

Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget

Anoop Korattikara, +2 more

TL;DR: An approximate Metropolis-Hastings (MH) rule based on a sequential hypothesis test is proposed; it accepts or rejects samples with high confidence using only a fraction of the data required for the exact MH rule.

Proceedings Article

Super-samples from kernel herding

Yutian Chen, +2 more

TL;DR: Kernel herding is an infinite-memory deterministic process that learns to approximate a PDF with a collection of samples; it decreases the error of expectations of functions in the Hilbert space at a rate O(1/T), which is much faster than the usual O(1/√T) for iid random samples.
