Mastering the game of Go with deep neural networks and tree search

doi:10.1038/NATURE16961

Journal Article

David Silver, +19 more

- 28 Jan 2016 -

Nature

- Vol. 529, Iss: 7587, pp 484-489

TLDR

Using a new search algorithm that combines Monte Carlo simulation with value and policy networks, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.

Abstract:

The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
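The abstract's idea of a search that combines Monte Carlo simulation with value and policy networks can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the names `Edge`, `select_move`, `evaluate_leaf` and `backup`, the constants `c_puct` and `mixing`, and the `+ 1` inside the square root are all hypothetical, and values are tracked from a single player's perspective for simplicity. The sketch shows a policy-network prior steering which move to explore, and a leaf evaluation that blends a value-network estimate with a rollout result.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Edge:
    """Search statistics for one move out of a tree node (illustrative names)."""
    prior: float               # move probability assigned by the policy network
    visits: int = 0            # simulations that passed through this move
    total_value: float = 0.0   # sum of leaf evaluations backed up through it

    @property
    def mean_value(self) -> float:
        return self.total_value / self.visits if self.visits else 0.0


def select_move(edges: Dict[str, Edge], c_puct: float = 5.0) -> str:
    """Pick the move with the highest mean value plus prior-weighted bonus.

    The bonus shrinks as a move accumulates visits, so the search starts
    from the policy network's suggestions and gradually trusts measured values.
    """
    parent_visits = sum(e.visits for e in edges.values())

    def score(edge: Edge) -> float:
        # "+ 1" is a simplification so priors still matter before any visit.
        bonus = c_puct * edge.prior * math.sqrt(parent_visits + 1) / (1 + edge.visits)
        return edge.mean_value + bonus

    return max(edges, key=lambda move: score(edges[move]))


def evaluate_leaf(value_estimate: float, rollout_outcome: float,
                  mixing: float = 0.5) -> float:
    """Blend the value network's estimate with a Monte Carlo rollout result."""
    return (1 - mixing) * value_estimate + mixing * rollout_outcome


def backup(path: List[Edge], leaf_value: float) -> None:
    """Update visit counts and value sums along the traversed path."""
    for edge in path:
        edge.visits += 1
        edge.total_value += leaf_value
```

In this sketch, a move with a high prior is tried first, and a poor backed-up evaluation then shifts attention to the alternatives.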

Citations

Journal Article

Dermatologist-level classification of skin cancer with deep neural networks

Andre Esteva, +7 more

- 02 Feb 2017 -

Nature

TL;DR: This work demonstrates an artificial intelligence capable of classifying skin cancer with a level of competence comparable to dermatologists, trained end-to-end from images directly, using only pixels and disease labels as inputs.

Journal Article

Mastering the game of Go without human knowledge

David Silver, +16 more

- 19 Oct 2017 -

Nature

TL;DR: An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.

Proceedings Article

Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization

Ramprasaath R. Selvaraju, +5 more

TL;DR: This work combines existing fine-grained visualizations to create a high-resolution class-discriminative visualization, Guided Grad-CAM, and applies it to image classification, image captioning, and visual question answering (VQA) models, including ResNet-based architectures.

Proceedings Article

Towards Evaluating the Robustness of Neural Networks

Nicholas Carlini, +1 more

TL;DR: In this paper, the authors demonstrate that defensive distillation does not significantly increase the robustness of neural networks by introducing three new attack algorithms that are successful on both distilled and undistilled neural networks with 100% probability.

Journal Article

Places: A 10 Million Image Database for Scene Recognition

Bolei Zhou, +4 more

- 01 Jun 2018 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This work describes the Places Database, a repository of 10 million scene photographs labeled with scene semantic categories and covering a large and diverse list of the types of environments encountered in the world, and provides state-of-the-art Convolutional Neural Network baselines that significantly outperform previous approaches.

References

Book Chapter

A lock-free multithreaded monte-carlo tree search algorithm

Markus Enzenberger, +1 more

TL;DR: A new lock-free parallel algorithm for Monte-Carlo tree search which takes advantage of the memory model of the IA-32 and Intel-64 CPU architectures and intentionally ignores rare faulty updates of node values is presented.

Book Chapter

Monte-Carlo simulation balancing in practice

Shih-Chieh Huang, +2 more

TL;DR: The effectiveness of simulation balancing is demonstrated in a more realistic setting: a state-of-the-art program, Erica, learned an improved playout policy on the 9×9 board without requiring any external expert to provide position evaluations.

Proceedings Article

Bootstrapping from Game Tree Search

Joel Veness, +3 more

TL;DR: This paper introduces a new algorithm that updates the parameters of a heuristic evaluation function towards the values computed by an alpha-beta search, and implements it in a chess program, Meep, using a linear heuristic function.

Book Chapter

Evaluation in Go by a Neural Network Using Soft Segmentation

Markus Enzenberger

TL;DR: A neural network architecture is presented that is able to build a soft segmentation of a two-dimensional input that is applied to position evaluation in the game of Go.

Book Chapter

On the scalability of parallel UCT

Richard B. Segal

TL;DR: This paper first analyzes the single-threaded scaling of Fuego and finds that there is an upper bound on the play-quality improvements which can come from additional search, and determines the maximum amount of parallelism supported by MCTS.
