An Efficient Hardware Implementation of Reinforcement Learning: The Q-Learning Algorithm

doi:10.1109/ACCESS.2019.2961174

Open Access, Journal Article


Sergio Spanò, +7 more

20 Dec 2019

IEEE Access

Vol. 7, pp. 186340-186351


TLDR

The paper proposes an efficient hardware architecture that implements the Q-Learning algorithm and is suitable for real-time applications, offering low power, high throughput and limited hardware resource usage. It also proposes a technique based on approximated multipliers to reduce the hardware complexity of the algorithm.

Abstract:

In this paper we propose an efficient hardware architecture that implements the Q-Learning algorithm, suitable for real-time applications. Its main features are low power, high throughput and limited hardware resources. We also propose a technique based on approximated multipliers to reduce the hardware complexity of the algorithm. We implemented the design on a Xilinx Zynq UltraScale+ MPSoC ZCU106 Evaluation Kit. The implementation results are evaluated in terms of hardware resources, throughput and power consumption. The architecture is compared to the state-of-the-art Q-Learning hardware accelerators presented in the literature, obtaining better results in speed, power and hardware resources. Experiments using different sizes for the Q-Matrix and different wordlengths for the fixed-point arithmetic are presented. With a Q-Matrix of size $8\times4$ (8-bit data) we achieved a throughput of 222 MSPS (Mega Samples Per Second) and a dynamic power consumption of 37 mW, while with a Q-Matrix of size $256\times16$ (32-bit data) we achieved a throughput of 93 MSPS and a power consumption of 611 mW. Due to the small amount of hardware resources required by the accelerator, our system is suitable for multi-agent IoT applications. Moreover, the architecture can be used to implement the SARSA (State-Action-Reward-State-Action) Reinforcement Learning algorithm with minor modifications.
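
To make the ideas in the abstract concrete, the following is a minimal software sketch of a fixed-point Q-Learning update in which the two multiplications (by the learning rate and by the discount factor) are replaced with bit shifts. It is not the architecture from the paper: the abstract does not say how the multipliers are approximated, so the power-of-two scheme used here (learning rate = 2^-alpha_shift, discount = 1 - 2^-gamma_shift), the number of fractional bits, and all function and parameter names are assumptions chosen for illustration. Saturation to a fixed wordlength is omitted for brevity. Passing `a_next` switches the update target from the greedy maximum (Q-Learning) to the value of the action actually taken next (SARSA), which illustrates why the two algorithms can share most of a datapath.

```python
FRAC_BITS = 8  # assumed number of fractional bits in the fixed-point format


def to_fixed(x):
    """Convert a real number to the assumed fixed-point integer format."""
    return int(round(x * (1 << FRAC_BITS)))


def q_update(q, s, a, r, s_next, alpha_shift=1, gamma_shift=3, a_next=None):
    """Update q[s][a] in place using shifts instead of multipliers.

    q is a list of rows (states) of integer Q-values (actions).
    r is the reward, already in fixed-point format.
    Assumed approximations: alpha = 2**-alpha_shift,
    gamma = 1 - 2**-gamma_shift.
    """
    if a_next is None:
        # Q-Learning: bootstrap from the best action in the next state.
        target_q = max(q[s_next])
    else:
        # SARSA: bootstrap from the action actually selected next.
        target_q = q[s_next][a_next]

    # gamma * target_q, computed as target_q - target_q / 2**gamma_shift.
    discounted = target_q - (target_q >> gamma_shift)

    # Temporal-difference error, then scale by alpha with a right shift.
    # Python's >> on negative integers is an arithmetic (floor) shift.
    td_error = r + discounted - q[s][a]
    q[s][a] += td_error >> alpha_shift
    return q[s][a]


# Usage: an 8x4 Q-Matrix (8 states, 4 actions), the smallest size
# mentioned in the abstract.
q_matrix = [[0] * 4 for _ in range(8)]
q_matrix[1][1] = to_fixed(1.0)
new_value = q_update(q_matrix, s=0, a=2, r=to_fixed(1.0), s_next=1)
```

With these assumed settings the learning rate is 0.5 and the discount factor is 0.875. A hardware version would also have to clip results to the chosen wordlength, which this sketch does not model.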




