Journal ArticleDOI

Solving the Rubik’s cube with deep reinforcement learning and search

TLDR
A new deep learning based search heuristic performs well on the iconic Rubik’s cube and can also generalize to puzzles in which optimal solvers are intractable.
Abstract
The Rubik’s cube is a prototypical combinatorial puzzle that has a large state space with a single goal state. The goal state is unlikely to be accessed using sequences of randomly generated moves, posing unique challenges for machine learning. We solve the Rubik’s cube with DeepCubeA, a deep reinforcement learning approach that learns how to solve increasingly difficult states in reverse from the goal state without any specific domain knowledge. DeepCubeA solves 100% of all test configurations, finding a shortest path to the goal state 60.3% of the time. DeepCubeA generalizes to other combinatorial puzzles and is able to solve the 15 puzzle, 24 puzzle, 35 puzzle, 48 puzzle, Lights Out, and Sokoban, finding a shortest path in the majority of verifiable cases. For some combinatorial puzzles, solutions can be verified to be optimal; for others, the state space is too large to be certain that a solution is optimal. A new deep learning based search heuristic performs well on the iconic Rubik’s cube and can also generalize to puzzles in which optimal solvers are intractable.
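To make the approach concrete, here is a minimal sketch of the two ingredients the abstract describes: a cost-to-go estimate trained on states scrambled backwards from the goal (so progressively harder states are generated without domain knowledge), and a weighted best-first search that uses that estimate as its heuristic. The toy permutation puzzle, the tabular value function, and all names below are illustrative stand-ins, not the authors' implementation.

```python
# Illustrative sketch of the two DeepCubeA ingredients: (1) approximate value
# iteration on states scrambled backwards from the goal, (2) weighted A* search
# guided by the learned cost-to-go estimate. The toy "puzzle" and the tabular
# value function stand in for the Rubik's cube and the deep network.
import heapq
import random

GOAL = (0, 1, 2, 3)                          # toy goal state of a small permutation puzzle

def moves(state):
    """Neighbors of a state: swap any adjacent pair (hypothetical move set)."""
    out = []
    for i in range(len(state) - 1):
        s = list(state)
        s[i], s[i + 1] = s[i + 1], s[i]
        out.append(tuple(s))
    return out

def scramble(k):
    """Generate a training state by applying k random moves in reverse from the goal."""
    s = GOAL
    for _ in range(k):
        s = random.choice(moves(s))
    return s

def train_value_function(iterations=20000, max_scramble=10):
    """Approximate value iteration: J(s) <- 1 + min over neighbors of J, with J(goal) = 0.
    A dict plays the role of the deep cost-to-go network."""
    J = {GOAL: 0.0}
    for _ in range(iterations):
        s = scramble(random.randint(1, max_scramble))
        if s == GOAL:
            continue
        # The paper regresses a network toward this bootstrapped target instead.
        J[s] = 1.0 + min(J.get(n, float(max_scramble)) for n in moves(s))
    return J

def weighted_a_star(start, J, weight=0.6):
    """Best-first search with f(s) = weight * g(s) + h(s), h given by the learned J."""
    h = lambda s: J.get(s, 0.0)
    frontier = [(h(start), 0.0, start, [start])]
    best_g = {start: 0.0}
    while frontier:
        _, g, s, path = heapq.heappop(frontier)
        if s == GOAL:
            return path
        for n in moves(s):
            g2 = g + 1.0
            if g2 < best_g.get(n, float("inf")):
                best_g[n] = g2
                heapq.heappush(frontier, (weight * g2 + h(n), g2, n, path + [n]))
    return None

if __name__ == "__main__":
    J = train_value_function()
    print(weighted_a_star((3, 2, 1, 0), J))   # sequence of states from the scramble to GOAL
```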


Citations
Posted Content

Model-based Reinforcement Learning: A Survey

TL;DR: A survey of the integration of planning and learning, better known as model-based reinforcement learning, and a broad conceptual overview of planning-learning combinations for MDP optimization are presented.
Journal ArticleDOI

Topological Quantum Compiling with Reinforcement Learning.

TL;DR: An efficient algorithm based on deep reinforcement learning that compiles an arbitrary single-qubit gate into a sequence of elementary gates from a finite universal set is introduced, which generates near-optimal gate sequences with given accuracy.
Journal ArticleDOI

A Fortran-Keras Deep Learning Bridge for Scientific Computing

TL;DR: A software library, the Fortran-Keras Bridge (FKB), connects environments where deep learning resources are plentiful with those where they are scarce, and its use reveals many neural network architectures that produce considerable improvements in stability, including some with reduced error, on an especially challenging training dataset.
Posted Content

Automated curricula through setter-solver interactions.

TL;DR: These results represent a substantial step towards applying automatic task curricula to learn complex, otherwise unlearnable goals, and are the first to demonstrate automated curriculum generation for goal-conditioned agents in environments where the possible goals vary between episodes.
Posted Content

A Fortran-Keras Deep Learning Bridge for Scientific Computing

TL;DR: The Fortran-Keras Bridge (FKB), as discussed by the authors, enables more than one hundred candidate models of subgrid cloud and radiation physics, explored through a hyperparameter search and initially implemented in Keras, to be transferred and used in Fortran.
References
Proceedings ArticleDOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously; the resulting networks won 1st place in the ILSVRC 2015 classification task.
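For context on the residual idea this entry summarizes: each block computes a small correction F(x) and adds it back to its input through a skip connection, which is what keeps very deep stacks trainable. The PyTorch module below is a generic, hedged sketch of such a block, not the exact architecture from the cited paper.

```python
# A generic residual block: the output is x + F(x), where F is a small stack of
# conv/batch-norm layers. Illustrative sketch only.
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)    # skip connection: identity plus learned residual

x = torch.randn(2, 64, 32, 32)
print(ResidualBlock(64)(x).shape)           # torch.Size([2, 64, 32, 32])
```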
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
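As a reference for the update rule this summary describes: Adam keeps exponential moving averages of the gradient and of its elementwise square (the first and second moments), corrects their zero-initialization bias, and scales each step by the ratio of the two. A minimal NumPy sketch with the conventional default hyperparameters:

```python
# Minimal sketch of a single Adam update step (standard formulation, default
# hyperparameters); not a drop-in replacement for a framework optimizer.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad             # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2        # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)                   # bias correction for zero initialization
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta, m, v = np.zeros(3), np.zeros(3), np.zeros(3)
for t in range(1, 4):
    grad = np.array([0.1, -0.2, 0.3])              # stand-in gradient
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)
```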
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
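For reference, the normalization summarized here standardizes each feature over the current mini-batch and then applies a learned scale and shift. A NumPy sketch of the training-time transform (inference additionally uses running statistics, omitted here):

```python
# Training-time batch normalization for a (batch, features) activation matrix:
# normalize each feature over the mini-batch, then apply learned gamma/beta.
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    mean = x.mean(axis=0)                       # per-feature mini-batch mean
    var = x.var(axis=0)                         # per-feature mini-batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)     # normalized activations
    return gamma * x_hat + beta                 # learned scale and shift

x = np.random.randn(8, 4) * 5.0 + 3.0
out = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0).round(6), out.std(axis=0).round(3))   # ~0 mean, ~1 std per feature
```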
Posted Content

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Batch Normalization as mentioned in this paper normalizes layer inputs for each training mini-batch to reduce the internal covariate shift in deep neural networks, and achieves state-of-the-art performance on ImageNet.
Journal ArticleDOI

Deep learning in neural networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.