Journal Article

Human-level control through deep reinforcement learning

Abstract
The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems, the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms. While reinforcement learning agents have achieved some successes in a variety of domains, their applicability has previously been limited to domains in which useful features can be handcrafted, or to domains with fully observed, low-dimensional state spaces. Here we use recent advances in training deep neural networks to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the challenging domain of classic Atari 2600 games. We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters. This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
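
The temporal difference idea mentioned in the abstract can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation: the deep Q-network replaces the lookup table below with a deep neural network that reads pixels, and the function names, the toy state/action indexing, and the values of `alpha`, `gamma` and `epsilon` are assumptions chosen only for illustration.

```python
import random


def td_update(q, state, action, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning step and return the TD error.

    q is a list of lists: q[state][action] holds the current value estimate.
    alpha (step size) and gamma (discount) are illustrative, not from the paper.
    """
    # Value of the best action in the next state; zero if the episode ended.
    best_next = 0.0 if done else max(q[next_state])
    # TD target: observed reward plus discounted estimate of future return.
    target = reward + gamma * best_next
    # TD error: how far the current estimate is from the target.
    td_error = target - q[state][action]
    # Move the estimate a fraction alpha of the way toward the target.
    q[state][action] += alpha * td_error
    return td_error


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    n_actions = len(q[state])
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q[state][a])


if __name__ == "__main__":
    rng = random.Random(0)
    q_table = [[0.0, 0.0], [0.0, 0.0]]  # hypothetical 2-state, 2-action problem
    action = epsilon_greedy(q_table, 0, epsilon=0.1, rng=rng)
    error = td_update(q_table, 0, action, reward=1.0, next_state=1, done=False)
    print(action, error, q_table)
```

The TD error returned by `td_update` plays the role of the learning signal: it is positive when the outcome was better than predicted and negative when it was worse.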

