Journal Article

Adaptive Optimal Control of Unknown Constrained-Input Systems Using Policy Iteration and Neural Networks

TLDR
This paper presents an online policy iteration (PI) algorithm that learns the continuous-time optimal control solution for unknown constrained-input systems; two neural networks are tuned online and simultaneously to generate the optimal bounded control policy.
Abstract
This paper presents an online policy iteration (PI) algorithm to learn the continuous-time optimal control solution for unknown constrained-input systems. The proposed PI algorithm is implemented on an actor-critic structure where two neural networks (NNs) are tuned online and simultaneously to generate the optimal bounded control policy. The requirement of complete knowledge of the system dynamics is obviated by employing a novel NN identifier in conjunction with the actor and critic NNs. It is shown how the identifier weight estimation error affects the convergence of the critic NN. A novel learning rule is developed to guarantee that the identifier weights converge to small neighborhoods of their ideal values exponentially fast. To provide an easy-to-check persistence of excitation condition, the experience replay technique is used: recorded past experiences are used simultaneously with current data for the adaptation of the identifier weights. Stability of the whole system consisting of the actor, critic, system state, and system identifier is guaranteed while all three networks undergo adaptation. Convergence to a near-optimal control law is also shown. The effectiveness of the proposed method is illustrated with a simulation example.
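
The experience replay idea in the abstract can be illustrated with a minimal sketch. It is not the paper's learning rule: it assumes a linear-in-parameters identifier, a plain gradient update, a hypothetical polynomial basis (`regressor`), and noise-free measurements. Recorded samples are reused together with the current sample, and the excitation requirement reduces to a rank check on the recorded regressors (`memory_is_rich`, also a hypothetical helper name).

```python
import numpy as np

def regressor(x):
    # Hypothetical polynomial basis for a scalar state (assumption for illustration).
    return np.array([1.0, x, x * x])

def memory_is_rich(memory):
    # Easy-to-check condition: the recorded regressors span the weight space.
    Phi = np.stack([phi for phi, _ in memory])
    return np.linalg.matrix_rank(Phi) == Phi.shape[1]

def identify(true_w, use_replay, steps=500, lr=0.05):
    # Record a small batch of past experiences as (regressor, measured output) pairs.
    memory = [(regressor(x), regressor(x) @ true_w)
              for x in np.linspace(-1.0, 1.0, 10)]
    assert memory_is_rich(memory)

    w_hat = np.zeros_like(true_w)
    # The state has settled at a constant value, so the current sample
    # alone only excites one direction of the weight space.
    phi_now = regressor(0.5)
    y_now = phi_now @ true_w

    for _ in range(steps):
        # Gradient step on the current prediction error.
        grad = phi_now * (y_now - phi_now @ w_hat)
        if use_replay:
            # Add the prediction errors on the recorded experiences.
            for phi_j, y_j in memory:
                grad += phi_j * (y_j - phi_j @ w_hat)
        w_hat = w_hat + lr * grad
    return np.linalg.norm(true_w - w_hat)

true_w = np.array([1.0, -2.0, 0.5])  # assumed "ideal" weights for the demo
err_replay = identify(true_w, use_replay=True)
err_current_only = identify(true_w, use_replay=False)
print(err_replay, err_current_only)
```

In this toy setting the weight error with replay shrinks toward zero, while the update driven by the current sample alone stalls because a single constant regressor cannot distinguish all three weights.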

Citations
Journal Article

Policy Iteration Adaptive Dynamic Programming Algorithm for Discrete-Time Nonlinear Systems

TL;DR: It is shown that the iterative performance index function converges nonincreasingly to the optimal solution of the Hamilton-Jacobi-Bellman equation, and it is proven that any of the iterative control laws can stabilize the nonlinear system.
Journal Article

Adaptive Fuzzy Neural Network Control for a Constrained Robot Using Impedance Learning

TL;DR: With the proposed control, the stability of the closed-loop system is achieved via Lyapunov’s stability theory, and the tracking performance is guaranteed under the condition of state constraints and uncertainty.
Journal Article

$H_{\infty}$ Tracking Control of Completely Unknown Continuous-Time Systems via Off-Policy Reinforcement Learning

TL;DR: This paper designs an H∞ tracking controller for nonlinear continuous-time systems with completely unknown dynamics; an off-policy reinforcement learning algorithm learns the solution to the tracking HJI equation online without requiring any knowledge of the system dynamics.
Journal Article

Data-Driven Optimal Consensus Control for Discrete-Time Multi-Agent Systems With Unknown Dynamics Using Reinforcement Learning Method

TL;DR: A data-based adaptive dynamic programming method is presented that uses current and past system data rather than an accurate system model or a traditional identification scheme, which would introduce approximation residual errors.
Journal Article

Neural-Learning-Based Telerobot Control With Guaranteed Performance

TL;DR: A neural network (NN) enhanced telerobot control system is designed and tested on a Baxter robot, and guaranteed performance is achieved at both kinematic and dynamic levels.
References
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Journal Article

Multilayer feedforward networks are universal approximators

TL;DR: It is rigorously established that standard multilayer feedforward networks with as few as one hidden layer using arbitrary squashing functions are capable of approximating any Borel measurable function from one finite dimensional space to another to any desired degree of accuracy, provided sufficiently many hidden units are available.
Proceedings Article

Robust adaptive control

TL;DR: In this article, the authors present a model for dynamic control systems based on Adaptive Control System Design Steps (ACDS) with Adaptive Observers and Parameter Identifiers.

Neuro-Dynamic Programming.

TL;DR: In this article, the authors present the first textbook that fully explains the neuro-dynamic programming/reinforcement learning methodology, which is a recent breakthrough in the practical application of neural networks and dynamic programming to complex problems of planning, optimal decision making, and intelligent control.