Journal ArticleDOI

Solving the Rubik’s cube with deep reinforcement learning and search

TLDR
A new deep learning based search heuristic performs well on the iconic Rubik’s cube and can also generalize to puzzles in which optimal solvers are intractable.
Abstract
The Rubik’s cube is a prototypical combinatorial puzzle that has a large state space with a single goal state. The goal state is unlikely to be accessed using sequences of randomly generated moves, posing unique challenges for machine learning. We solve the Rubik’s cube with DeepCubeA, a deep reinforcement learning approach that learns how to solve increasingly difficult states in reverse from the goal state without any specific domain knowledge. DeepCubeA solves 100% of all test configurations, finding a shortest path to the goal state 60.3% of the time. DeepCubeA generalizes to other combinatorial puzzles and is able to solve the 15 puzzle, 24 puzzle, 35 puzzle, 48 puzzle, Lights Out, and Sokoban, finding a shortest path in the majority of verifiable cases. For some combinatorial puzzles, solutions can be verified to be optimal; for others, the state space is too large to be certain that a solution is optimal. A new deep learning based search heuristic performs well on the iconic Rubik’s cube and can also generalize to puzzles in which optimal solvers are intractable.
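To make the approach concrete, here is a minimal sketch of the two ingredients the abstract describes: a cost-to-go estimate trained on states scrambled backwards from the goal (so progressively harder states are generated without domain knowledge), and a weighted best-first search that uses that estimate as its heuristic. The toy permutation puzzle, the tabular value function, and all names below are illustrative stand-ins, not the authors' implementation.

```python
# Illustrative sketch of the two DeepCubeA ingredients: (1) approximate value
# iteration on states scrambled backwards from the goal, (2) weighted A* search
# guided by the learned cost-to-go estimate. The toy "puzzle" and the tabular
# value function stand in for the Rubik's cube and the deep network.
import heapq
import random

GOAL = (0, 1, 2, 3)                          # toy goal state of a small permutation puzzle

def moves(state):
    """Neighbors of a state: swap any adjacent pair (hypothetical move set)."""
    out = []
    for i in range(len(state) - 1):
        s = list(state)
        s[i], s[i + 1] = s[i + 1], s[i]
        out.append(tuple(s))
    return out

def scramble(k):
    """Generate a training state by applying k random moves in reverse from the goal."""
    s = GOAL
    for _ in range(k):
        s = random.choice(moves(s))
    return s

def train_value_function(iterations=20000, max_scramble=10):
    """Approximate value iteration: J(s) <- 1 + min over neighbors of J, with J(goal) = 0.
    A dict plays the role of the deep cost-to-go network."""
    J = {GOAL: 0.0}
    for _ in range(iterations):
        s = scramble(random.randint(1, max_scramble))
        if s == GOAL:
            continue
        # The paper regresses a network toward this bootstrapped target instead.
        J[s] = 1.0 + min(J.get(n, float(max_scramble)) for n in moves(s))
    return J

def weighted_a_star(start, J, weight=0.6):
    """Best-first search with f(s) = weight * g(s) + h(s), h given by the learned J."""
    h = lambda s: J.get(s, 0.0)
    frontier = [(h(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    while frontier:
        _, g, s, path = heapq.heappop(frontier)
        if s == GOAL:
            return path
        for n in moves(s):
            g2 = g + 1.0
            if g2 < best_g.get(n, float("inf")):
                best_g[n] = g2
                heapq.heappush(frontier, (weight * g2 + h(n), g2, n, path + [n]))
    return None

if __name__ == "__main__":
    J = train_value_function()
    print(weighted_a_star((3, 2, 1, 0), J))   # sequence of states from the scramble to GOAL
```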


Citations
Posted Content

Model-based Reinforcement Learning: A Survey

TL;DR: A survey of the integration of planning and learning, better known as model-based reinforcement learning, and a broad conceptual overview of planning-learning combinations for MDP optimization are presented.
Journal ArticleDOI

Topological Quantum Compiling with Reinforcement Learning.

TL;DR: An efficient algorithm based on deep reinforcement learning that compiles an arbitrary single-qubit gate into a sequence of elementary gates from a finite universal set is introduced, which generates near-optimal gate sequences with given accuracy.
Journal ArticleDOI

A Fortran-Keras Deep Learning Bridge for Scientific Computing

TL;DR: A software library, the Fortran-Keras Bridge (FKB), connects environments where deep learning resources are plentiful with those where they are scarce, and its use reveals many neural network architectures that produce considerable improvements in stability, including some with reduced error, on an especially challenging training dataset.
Posted Content

Automated curricula through setter-solver interactions.

TL;DR: These results represent a substantial step towards applying automatic task curricula to learn complex, otherwise unlearnable goals, and are the first to demonstrate automated curriculum generation for goal-conditioned agents in environments where the possible goals vary between episodes.
Posted Content

A Fortran-Keras Deep Learning Bridge for Scientific Computing

TL;DR: The Fortran-Keras Bridge (FKB), as discussed by the authors, enables more than one hundred candidate models of subgrid cloud and radiation physics, explored through a hyperparameter search and initially implemented in Keras, to be transferred and used in Fortran.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place in the ILSVRC 2015 classification task.
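For context on the residual idea this entry summarizes: each block computes a small correction F(x) and adds it back to its input through a skip connection, which is what keeps very deep stacks trainable. The PyTorch module below is a generic, hedged sketch of such a block, not the exact architecture from the cited paper.

```python
# A generic residual block: the output is x + F(x), where F is a small stack of
# conv/batch-norm layers. Illustrative sketch only.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)    # skip connection: identity plus learned residual

x = torch.randn(2, 64, 32, 32)
print(ResidualBlock(64)(x).shape)           # torch.Size([2, 64, 32, 32])
```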
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
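As a reference for the update rule this summary describes: Adam keeps exponential moving averages of the gradient and of its elementwise square (the first and second moments), corrects their zero-initialization bias, and scales each step by the ratio of the two. A minimal NumPy sketch with the conventional default hyperparameters:

```python
# Minimal sketch of a single Adam update step (standard formulation, default
# hyperparameters); not a drop-in replacement for a framework optimizer.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad             # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2        # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)                   # bias correction for zero initialization
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
for t in range(1, 4):
    grad = np.array([0.1, -0.2, 0.3])              # stand-in gradient
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)
```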
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
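For reference, the normalization summarized here standardizes each feature over the current mini-batch and then applies a learned scale and shift. A NumPy sketch of the training-time transform (inference additionally uses running statistics, omitted here):

```python
# Training-time batch normalization for a (batch, features) activation matrix:
# normalize each feature over the mini-batch, then apply learned gamma/beta.
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    mean = x.mean(axis=0)                       # per-feature mini-batch mean
    var = x.var(axis=0)                         # per-feature mini-batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)     # normalized activations
    return gamma * x_hat + beta                 # learned scale and shift

x = np.random.randn(8, 4) * 5.0 + 3.0
out = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0).round(6), out.std(axis=0).round(3))   # ~0 mean, ~1 std per feature
```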
Posted Content

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Batch Normalization as mentioned in this paper normalizes layer inputs for each training mini-batch to reduce the internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.
Journal ArticleDOI

Deep learning in neural networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.