Open Access Posted Content

Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm

TLDR
This paper generalises the approach into a single AlphaZero algorithm that achieves, tabula rasa, superhuman performance in many challenging domains, convincingly defeating a world-champion program in each case.
Abstract
The game of chess is the most widely-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. In contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go, by tabula rasa reinforcement learning from games of self-play. In this paper, we generalise this approach into a single AlphaZero algorithm that can achieve, tabula rasa, superhuman performance in many challenging domains. Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case.
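
To make the training procedure described in the abstract more concrete, the following is a minimal Python sketch of an AlphaZero-style self-play loop. The Game, PolicyValueNet and mcts_policy names are hypothetical placeholders introduced for illustration only; they are not the authors' implementation, which uses deep residual networks and a large-scale distributed Monte Carlo tree search.

# Schematic AlphaZero-style self-play training loop (illustration only).
# Game, PolicyValueNet and mcts_policy are hypothetical placeholders.
import random

class Game:
    """Placeholder for a two-player zero-sum game exposing only its rules."""
    def initial_state(self): ...
    def legal_moves(self, state): ...
    def next_state(self, state, move): ...
    def is_terminal(self, state): ...
    def outcome(self, state): ...  # game result, e.g. +1 / 0 / -1

class PolicyValueNet:
    """Placeholder network mapping a state to (move priors, value estimate)."""
    def predict(self, state): ...
    def train(self, examples): ...  # examples are (state, search_policy, outcome) triples

def mcts_policy(game, net, state, simulations=800):
    """Placeholder Monte Carlo tree search guided by the network's priors and
    values; returns a dict of visit-count-based move probabilities."""
    ...

def self_play_game(game, net):
    """Play one game against itself, recording (state, search policy) pairs."""
    examples, state = [], game.initial_state()
    while not game.is_terminal(state):
        pi = mcts_policy(game, net, state)          # dict: move -> probability
        examples.append((state, pi))
        move = random.choices(list(pi), weights=list(pi.values()))[0]
        state = game.next_state(state, move)
    z = game.outcome(state)                         # per-player sign handling omitted
    return [(s, pi, z) for s, pi in examples]

def train_loop(game, net, iterations, games_per_iteration):
    """Alternate self-play data generation with network updates, starting tabula rasa."""
    for _ in range(iterations):
        data = []
        for _ in range(games_per_iteration):
            data.extend(self_play_game(game, net))
        net.train(data)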


Citations
Proceedings ArticleDOI

Training language models to follow instructions with human feedback

TL;DR: The results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent, yielding improvements in truthfulness and reductions in toxic output generation while incurring minimal performance regressions on public NLP datasets.
Journal ArticleDOI

Adversarial Examples: Attacks and Defenses for Deep Learning

TL;DR: In this paper, the authors review recent findings on adversarial examples for DNNs, summarize the methods for generating adversarial samples, and propose a taxonomy of these methods.
Posted Content

Solving Rubik's Cube with a Robot Hand.

TL;DR: It is demonstrated that models trained only in simulation can be used to solve a manipulation problem of unprecedented complexity on a real robot, made possible by a novel algorithm called automatic domain randomization (ADR) and a robot platform built for machine learning.
Journal ArticleDOI

Machine learning & artificial intelligence in the quantum domain: a review of recent progress.

TL;DR: In this article, the authors describe the main ideas, recent developments and progress in a broad spectrum of research investigating ML and AI in the quantum domain, and discuss the fundamental issue of quantum generalizations of learning and AI concepts.
References
Journal ArticleDOI

Mastering the game of Go with deep neural networks and tree search

TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
Journal ArticleDOI

Mastering the game of Go without human knowledge

TL;DR: An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
Posted Content

In-Datacenter Performance Analysis of a Tensor Processing Unit

TL;DR: This paper evaluates a custom ASIC, called a Tensor Processing Unit (TPU), deployed in datacenters since 2015 that accelerates the inference phase of neural networks (NN) and compares it to a server-class Intel Haswell CPU and an Nvidia K80 GPU, which are contemporaries deployed in the same datacenters.
Journal ArticleDOI

Deep Blue

TL;DR: Deep Blue as discussed by the authors is the chess machine that defeated then-reigning World Chess Champion Garry Kasparov in a six-game match in 1997, the first match victory by a computer over a reigning world champion under standard tournament conditions.
Journal ArticleDOI

An analysis of alpha-beta pruning

TL;DR: The alpha-beta procedure for searching game trees is shown to be optimal in a certain sense, and bounds are obtained for its running time with various kinds of random data.
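
As a concrete illustration of the procedure analysed above, here is a minimal Python sketch of alpha-beta pruning in negamax form. The game interface (legal_moves, next_state, is_terminal, evaluate) is a hypothetical stand-in and not taken from the paper.

# Minimal alpha-beta pruning in negamax form (illustration only).
# `game` is any object exposing legal_moves(state), next_state(state, move),
# is_terminal(state) and evaluate(state) scored from the side to move.
import math

def alphabeta(game, state, depth, alpha=-math.inf, beta=math.inf):
    """Return the negamax value of `state`, pruning branches that cannot
    affect the final decision once alpha >= beta."""
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state)
    value = -math.inf
    for move in game.legal_moves(state):
        child = game.next_state(state, move)
        value = max(value, -alphabeta(game, child, depth - 1, -beta, -alpha))
        alpha = max(alpha, value)
        if alpha >= beta:  # cutoff: the opponent will never allow this branch
            break
    return value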