Journal Article
Policy-Iteration-Based Learning for Nonlinear Player Game Systems With Constrained Inputs
Chaoxu Mu, Ke Wang, Changyin Sun +2 more
TL;DR: An adaptive learning algorithm based on the policy iteration technique is developed to approximately obtain the Nash equilibrium from real-time data, solving the optimal control problem for nonlinear nonzero-sum differential games when no initial admissible policies are available.
Abstract:
This article investigates the optimal control problem for nonlinear nonzero-sum differential games when no initial admissible policies are available and the control inputs are constrained. An adaptive learning algorithm based on the policy iteration technique is developed to approximate the Nash equilibrium using real-time data. A two-player continuous-time system is used to present this approximation mechanism, which is implemented as a critic–actor architecture for every player. The constraint is incorporated into the optimization by introducing a nonquadratic value function, and the associated constrained Hamilton–Jacobi equation is derived. The critic neural network (NN) and actor NN are used to learn the value function and the optimal control policy, respectively, under novel weight tuning laws. To maintain stability during the learning phase, two stable operators are designed for the two actors. The proposed algorithm is proved to converge as a Newton’s iteration, and the stability of the closed-loop system is ensured by Lyapunov analysis. Finally, two simulation examples with different constraint scenarios demonstrate the effectiveness of the proposed learning scheme.
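The abstract does not state the functional form of the nonquadratic value function. As a rough illustration only, the sketch below assumes the inverse-hyperbolic-tangent penalty that is commonly used in the constrained-input adaptive dynamic programming literature, for a scalar input bounded by `lam`. The names `nonquadratic_penalty`, `bounded_policy`, `lam`, `r`, `g`, and `grad_v` are hypothetical and do not come from the paper.

```python
import math


def nonquadratic_penalty(u, lam, r=1.0):
    """Assumed input penalty U(u) = 2*r * integral_0^u lam*atanh(v/lam) dv.

    Closed form for a scalar input with |u| < lam:
        U(u) = 2*r*lam*u*atanh(u/lam) + r*lam**2 * ln(1 - (u/lam)**2)
    """
    if abs(u) >= lam:
        raise ValueError("control must lie strictly inside the bound")
    z = u / lam
    return 2.0 * r * lam * u * math.atanh(z) + r * lam ** 2 * math.log(1.0 - z * z)


def bounded_policy(grad_v, g, lam, r=1.0):
    """Stationary point of U(u) + grad_v * g * u with respect to u.

    Setting dU/du = 2*r*lam*atanh(u/lam) equal to -g*grad_v gives a
    tanh-shaped control that can never exceed the bound lam.
    """
    return -lam * math.tanh(g * grad_v / (2.0 * r * lam))


if __name__ == "__main__":
    # Hypothetical numbers: value gradient 0.8, input gain 1.5, bound 1.0.
    u_star = bounded_policy(grad_v=0.8, g=1.5, lam=1.0)
    print("bounded control:", u_star)
    print("penalty at that control:", nonquadratic_penalty(u_star, lam=1.0))
```

Under this assumed penalty, the cost is close to the quadratic `r * u**2` for small inputs and grows steeply as `u` approaches the bound, which is why the minimizing control saturates smoothly instead of being clipped.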
Citations
Journal Article
Compensator-critic structure-based neuro-optimal control of modular robot manipulators with uncertain environmental contacts using non-zero-sum games
TL;DR: A novel compensator-critic structure-based neuro-optimal control approach is presented for modular robot manipulators in contact with uncertain environments and the tracking error of the closed-loop robotic system is proved to be uniformly ultimately bounded (UUB) on the basis of the Lyapunov theory.
Journal Article
Optimal Synchronization Control of Heterogeneous Asymmetric Input-Constrained Unknown Nonlinear MASs via Reinforcement Learning
TL;DR: A data-based off-policy reinforcement learning (RL) algorithm is presented to learn the solution to the constrained HJB equation without requiring the complete knowledge of the agents' dynamics.
Journal Article
Event-triggered design for discrete-time nonlinear systems with control constraints
Chaoxu Mu, Kaiju Liao, Ke Wang +2 more
TL;DR: A non-quadratic function is used to encode the control constraints, a trigger condition is provided together with a stability analysis, and three simulation examples demonstrate the performance of the proposed event-triggered design for constrained discrete-time nonlinear systems.
Journal Article
Asynchronous learning for actor-critic neural networks and synchronous triggering for multiplayer system.
TL;DR: In this paper, an actor-critic neural network structure and reinforcement learning scheme are proposed to find the Nash equilibrium of a multiplayer nonzero-sum differential game in an adaptive fashion, where each player consists of one critic and one actor and implements distributed asynchronous policy iteration to optimize its decision-making process.
References
Journal Article
Nonzero-sum differential games
A. W. Starr, Yu-Chi Ho +1 more
TL;DR: Presents the theory of nonzero-sum differential games with application to economic analysis, discussing Nash equilibrium, minimax, and noninferior strategy sets.
Journal Article
Adaptive Neural Impedance Control of a Robotic Manipulator With Input Saturation
Wei He, Yiting Dong, Changyin Sun +2 more
TL;DR: In this article, an adaptive impedance controller for a robotic manipulator with input saturation is developed by employing neural networks, and the input saturation is handled by designing an auxiliary system.
Journal Article
Online learning control by association and reinforcement
Jennie Si, Yu-Tsung Wang +1 more
TL;DR: In this article, a generic online learning control system based on the fundamental principle of reinforcement learning, or more specifically neural dynamic programming, is presented, with a focus on a systematic treatment for developing such a system.
Related Papers (5)
Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations
Approximate Nash Solutions for Multiplayer Mixed-Zero-Sum Game With Reinforcement Learning
Yongfeng Lv, Xuemei Ren +1 more