Journal Article

Mastering the game of Go with deep neural networks and tree search

TLDR
Using a search algorithm that combines Monte Carlo simulation with value and policy networks, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
Abstract
The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
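The abstract states that the search combines Monte Carlo simulation with value and policy networks but gives no formulas. The following is a minimal illustrative sketch of one way such a combination could look, not the authors' implementation: the `Edge` layout, the exploration bonus in `select_move`, the `mix` weight in `evaluate_leaf`, and the constant `c_explore` are all assumptions made for this example. The policy prior steers which move is tried, and a leaf is scored by blending a value-network estimate with a simulated game outcome.

```python
import math
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Edge:
    """Statistics for one move out of a search-tree node (hypothetical layout)."""
    prior: float               # probability assigned to the move by a policy network
    visits: int = 0            # number of simulations that passed through this move
    total_value: float = 0.0   # sum of leaf evaluations backed up through this move

    @property
    def mean_value(self) -> float:
        # Average backed-up evaluation; 0.0 for a move that was never tried.
        return self.total_value / self.visits if self.visits else 0.0


def select_move(edges: Dict[str, Edge], c_explore: float = 1.0) -> str:
    """Pick the move with the highest mean value plus a prior-weighted bonus.

    The bonus shrinks as a move is visited more often, so the policy prior
    dominates early and the observed values dominate later (assumed form).
    """
    total_visits = sum(edge.visits for edge in edges.values())

    def score(edge: Edge) -> float:
        bonus = c_explore * edge.prior * math.sqrt(total_visits + 1) / (1 + edge.visits)
        return edge.mean_value + bonus

    return max(edges, key=lambda move: score(edges[move]))


def evaluate_leaf(value_estimate: float, rollout_outcome: float, mix: float = 0.5) -> float:
    """Blend a value-network estimate with a Monte Carlo rollout outcome."""
    return (1.0 - mix) * value_estimate + mix * rollout_outcome


def backup(path: List[Edge], leaf_value: float) -> None:
    """Add one simulation's leaf evaluation to every edge on the path.

    Simplification: the value is not sign-flipped between the two players.
    """
    for edge in path:
        edge.visits += 1
        edge.total_value += leaf_value


# One simulated step with made-up numbers.
edges = {"a": Edge(prior=0.7), "b": Edge(prior=0.3)}
first_choice = select_move(edges)                      # the prior decides when nothing is visited
leaf_value = evaluate_leaf(value_estimate=0.2, rollout_outcome=1.0)
backup([edges[first_choice]], leaf_value)
```

With no visits recorded, the move with the larger prior is selected first. After the backup, its running mean reflects the blended leaf evaluation, while the untried move keeps only its prior-based bonus.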

