Adaptive Optimal Control of Unknown Constrained-Input Systems Using Policy Iteration and Neural Networks

doi:10.1109/TNNLS.2013.2276571

Journal Article

Hamidreza Modares, +2 more

22 Aug 2013

IEEE Transactions on Neural Networks and Learning Systems

Vol. 24, Iss. 10, pp. 1513-1525

TLDR

This paper presents an online policy iteration (PI) algorithm that learns the continuous-time optimal control solution for unknown constrained-input systems; two neural networks are tuned online and simultaneously to generate the optimal bounded control policy.

Abstract:

This paper presents an online policy iteration (PI) algorithm to learn the continuous-time optimal control solution for unknown constrained-input systems. The proposed PI algorithm is implemented on an actor-critic structure where two neural networks (NNs) are tuned online and simultaneously to generate the optimal bounded control policy. The requirement of complete knowledge of the system dynamics is obviated by employing a novel NN identifier in conjunction with the actor and critic NNs. It is shown how the identifier weight estimation error affects the convergence of the critic NN. A novel learning rule is developed to guarantee that the identifier weights converge to small neighborhoods of their ideal values exponentially fast. To provide an easy-to-check persistence of excitation condition, the experience replay technique is used. That is, recorded past experiences are used simultaneously with current data for the adaptation of the identifier weights. Stability of the whole system consisting of the actor, critic, system state, and system identifier is guaranteed while all three networks undergo adaptation. Convergence to a near-optimal control law is also shown. The effectiveness of the proposed method is illustrated with a simulation example.
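
The experience replay idea in the abstract (adapting the identifier weights with recorded past data alongside the current sample) can be illustrated with a small sketch. This is not the paper's learning rule: it assumes a discrete-time gradient update, a linear-in-parameters identifier, noise-free targets, and a rank test on the stored regressors as one plausible form of an "easy-to-check" excitation condition. The basis `features`, the matrix `W_true`, and the stored inputs are all hypothetical.

```python
import numpy as np


def features(z):
    # Hypothetical basis functions standing in for the identifier's hidden layer.
    return np.array([z[0], z[1], z[0] * z[1]])


def replay_is_rich(memory):
    """Rank test (assumed condition): do stored regressors span the feature space?"""
    if not memory:
        return False
    Phi = np.stack([phi for phi, _ in memory], axis=1)  # (n_features, n_samples)
    return np.linalg.matrix_rank(Phi) == Phi.shape[0]


def identifier_step(W, phi, y, memory, lr=0.1):
    """One gradient step using the current sample and every stored sample."""
    grad = np.outer(phi, y - W.T @ phi)          # current-data term
    for phi_j, y_j in memory:                    # replayed-data terms
        grad += np.outer(phi_j, y_j - W.T @ phi_j)
    return W + lr * grad


def demo(use_replay, steps=500):
    """Return the final weight error when the current data is NOT exciting."""
    # Hypothetical "ideal" identifier weights: 3 features, 2 outputs.
    W_true = np.array([[1.0, -0.5], [0.3, 2.0], [-1.2, 0.7]])
    stored_inputs = [(1, 0), (0, 1), (1, 1), (-1, 1)]
    memory = []
    if use_replay:
        memory = [(features(z), W_true.T @ features(z)) for z in stored_inputs]
    # The current sample never changes, so on its own it excites one direction only.
    phi_now = features((0.5, -0.2))
    y_now = W_true.T @ phi_now
    W = np.zeros_like(W_true)
    for _ in range(steps):
        W = identifier_step(W, phi_now, y_now, memory)
    return float(np.linalg.norm(W - W_true))


if __name__ == "__main__":
    print("error with replay:   ", demo(use_replay=True))
    print("error without replay:", demo(use_replay=False))
```

In this toy setting the stored samples pass the rank test, so the weight error shrinks in every direction even though the live data stays constant; without the memory, only the component along the single current regressor is corrected.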

Citations

Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems

Adaptive Fuzzy Neural Network Control for a Constrained Robot Using Impedance Learning

$H_{\infty}$ Tracking Control of Completely Unknown Continuous-Time Systems via Off-Policy Reinforcement Learning

Data-Driven Optimal Consensus Control for Discrete-Time Multi-Agent Systems With Unknown Dynamics Using Reinforcement Learning Method

Neural-Learning-Based Telerobot Control With Guaranteed Performance

References

Reinforcement Learning: An Introduction

Multilayer feedforward networks are universal approximators

Robust adaptive control

Neuro-Dynamic Programming

Related Papers (5)

Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem

Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach

Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof

Reinforcement learning and adaptive dynamic programming for feedback control

Reinforcement Learning: An Introduction