Mastering the game of Go without human knowledge

doi:10.1038/NATURE24270

Citations

Journal Article•DOI•

Mastering the game of Go with deep neural networks and tree search

David Silver¹, Aja Huang¹, Chris J. Maddison¹, Arthur Guez¹, Laurent Sifre¹, George van den Driessche¹, Julian Schrittwieser¹, Ioannis Antonoglou¹, Veda Panneershelvam¹, Marc Lanctot¹, Sander Dieleman¹, Dominik Grewe¹, John Nham¹, Nal Kalchbrenner¹, Ilya Sutskever¹, Timothy P. Lillicrap¹, Madeleine Leach¹, Koray Kavukcuoglu¹, Thore Graepel¹, Demis Hassabis¹

Google¹

28 Jan 2016-Nature

TL;DR: Using a new search algorithm that combines Monte Carlo simulation with value and policy networks, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.

Abstract: The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.

14,377 citations

Cites background from "Mastering the game of Go without hu..."

...Moreover, the DeepMind team achieved even superhuman performance with AlphaGo’s successors AlphaGo Zero [7] and AlphaZero [5]....

Journal Article•DOI•

A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play.

David Silver¹, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy P. Lillicrap, Karen Simonyan, Demis Hassabis

University College London¹

07 Dec 2018-Science

TL;DR: This paper generalizes the AlphaGo Zero approach into a single AlphaZero algorithm that can achieve superhuman performance in many challenging games. Starting from random play, AlphaZero convincingly defeated a world champion program in the games of chess and shogi (Japanese chess), as well as Go.

Abstract: The game of chess is the longest-studied domain in the history of artificial intelligence. The strongest programs are based on a combination of sophisticated search techniques, domain-specific adaptations, and handcrafted evaluation functions that have been refined by human experts over several decades. By contrast, the AlphaGo Zero program recently achieved superhuman performance in the game of Go by reinforcement learning from self-play. In this paper, we generalize this approach into a single AlphaZero algorithm that can achieve superhuman performance in many challenging games. Starting from random play and given no domain knowledge except the game rules, AlphaZero convincingly defeated a world champion program in the games of chess and shogi (Japanese chess), as well as Go.

2,603 citations

Book•

Neural Networks and Deep Learning

Charu C. Aggarwal

01 Jan 2018

2,291 citations

Cites background or methods from "Mastering the game of Go without hu..."

...However, subsequent work [446] showed that dispensing with this part of the learning was a better option....
...Later versions of AlphaGo dispensed with the supervised portions of learning, adapted to chess and shogi, and performed better with zero initial knowledge [446, 447]....
...Subsequently, it was used in the game of Go [135, 346, 445, 446]....
...A later enhancement of the idea, referred to as AlphaGo Zero [446], removed the need for human expert moves (or an SL-network)....

Journal Article•DOI•

Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)

Amina Adadi¹, Mohammed Berrada¹

SIDI¹

17 Sep 2018-IEEE Access

TL;DR: This survey provides an entry point for interested researchers and practitioners to learn key aspects of the young and rapidly growing body of research related to XAI. It reviews the existing approaches to the topic, discusses trends surrounding its sphere, and presents major research trajectories.

Abstract: At the dawn of the fourth industrial revolution, we are witnessing a fast and widespread adoption of artificial intelligence (AI) in our daily life, which contributes to accelerating the shift towards a more algorithmic society. However, even with such unprecedented advancements, a key impediment to the use of AI-based systems is that they often lack transparency. Indeed, the black-box nature of these systems allows powerful predictions, but it cannot be directly explained. This issue has triggered a new debate on explainable AI (XAI), a research field that holds substantial promise for improving trust and transparency of AI-based systems. It is recognized as the sine qua non for AI to continue making steady progress without disruption. This survey provides an entry point for interested researchers and practitioners to learn key aspects of the young and rapidly growing body of research related to XAI. Through the lens of the literature, we review the existing approaches regarding the topic, discuss trends surrounding its sphere, and present major research trajectories.

2,258 citations

Cites background from "Mastering the game of Go without hu..."

...For example, given that AlphaGo Zero [40] can excel at the game of Go much better than human players, it would be desirable that the machine can explain its learned strategy (knowledge) to us....

Posted Content•

Group Normalization

Yuxin Wu, Kaiming He

22 Mar 2018-arXiv: Computer Vision and Pattern Recognition

TL;DR: Group Normalization can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks.

Abstract: Batch Normalization (BN) is a milestone technique in the development of deep learning, enabling various networks to train. However, normalizing along the batch dimension introduces problems --- BN's error increases rapidly when the batch size becomes smaller, caused by inaccurate batch statistics estimation. This limits BN's usage for training larger models and transferring features to computer vision tasks including detection, segmentation, and video, which require small batches constrained by memory consumption. In this paper, we present Group Normalization (GN) as a simple alternative to BN. GN divides the channels into groups and computes within each group the mean and variance for normalization. GN's computation is independent of batch sizes, and its accuracy is stable in a wide range of batch sizes. On ResNet-50 trained in ImageNet, GN has 10.6% lower error than its BN counterpart when using a batch size of 2; when using typical batch sizes, GN is comparably good with BN and outperforms other normalization variants. Moreover, GN can be naturally transferred from pre-training to fine-tuning. GN can outperform its BN-based counterparts for object detection and segmentation in COCO, and for video classification in Kinetics, showing that GN can effectively replace the powerful BN in a variety of tasks. GN can be easily implemented by a few lines of code in modern libraries.
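
The grouping step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `group_norm`, the list-of-channels input layout, and the `eps` value are assumptions made for illustration, and the learned per-channel scale and shift used in practice are omitted. It normalizes a single sample, which is why the batch size plays no role.

```python
import math


def group_norm(sample, num_groups, eps=1e-5):
    """Normalize one sample given as a list of channels (hypothetical layout).

    Each channel is a flat list of floats. Channels are split into
    `num_groups` contiguous groups, and the mean and variance are computed
    over all values in a group. No batch dimension is involved.
    """
    num_channels = len(sample)
    if num_channels % num_groups != 0:
        raise ValueError("number of channels must be divisible by num_groups")
    per_group = num_channels // num_groups

    out = []
    for g in range(num_groups):
        group = sample[g * per_group:(g + 1) * per_group]
        # Statistics are pooled over every value in the group's channels.
        values = [v for channel in group for v in channel]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * scale for v in channel] for channel in group])
    return out


# Four channels of two values each, normalized in two groups of two channels.
sample = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
normalized = group_norm(sample, num_groups=2)
```

After the call, each group of two channels has approximately zero mean and unit variance, regardless of how many other samples would be processed alongside it.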

1,924 citations
