Journal ArticleDOI

Policy-Iteration-Based Learning for Nonlinear Player Game Systems With Constrained Inputs

TL;DR: An adaptive learning algorithm based on policy iteration is developed to approximate the Nash equilibrium from real-time data, solving the optimal control problem for nonlinear nonzero-sum differential games when no initial admissible policies are available.
Abstract
This article investigates the optimal control problem for nonlinear nonzero-sum differential games with control constraints when no initial admissible policies are available. An adaptive learning algorithm based on the policy iteration technique is developed to approximate the Nash equilibrium using real-time data. A two-player continuous-time system is used to present this approximation mechanism, which is implemented as a critic–actor architecture for every player. The constraint is incorporated into the optimization by introducing a nonquadratic value function, and the associated constrained Hamilton–Jacobi equation is derived. A critic neural network (NN) and an actor NN are used to learn the value function and the optimal control policy, respectively, under novel weight tuning laws. To maintain stability during the learning phase, two stable operators are designed for the two actors. The proposed algorithm is proved to converge as a Newton’s iteration, and the stability of the closed-loop system is ensured by Lyapunov analysis. Finally, two simulation examples with different constraint scenarios demonstrate the effectiveness of the proposed learning scheme.
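
The abstract does not state the exact form of the nonquadratic value function. As a rough illustration only, the sketch below assumes the tanh-based penalty commonly used in the constrained-input optimal control literature, W(u) = 2 r ∫_0^u λ atanh(v/λ) dv, for a scalar input bounded by |u| < λ. The names `nonquadratic_penalty`, `constrained_policy`, `lam`, `r`, and `grad_v_dot_g` are hypothetical and do not come from the article.

```python
import math


def nonquadratic_penalty(u, lam, r=1.0):
    """Closed form of W(u) = 2*r * integral_0^u lam*atanh(v/lam) dv (assumed form)."""
    if abs(u) >= lam:
        raise ValueError("penalty is only defined for |u| < lam")
    s = u / lam
    return 2.0 * r * (lam * u * math.atanh(s) + 0.5 * lam ** 2 * math.log(1.0 - s ** 2))


def nonquadratic_penalty_numeric(u, lam, r=1.0, steps=2000):
    """Midpoint-rule check of the same integral, to validate the closed form."""
    h = u / steps
    total = sum(lam * math.atanh(((k + 0.5) * h) / lam) * h for k in range(steps))
    return 2.0 * r * total


def constrained_policy(grad_v_dot_g, lam, r=1.0):
    """Stationary point of W(u) + grad_v_dot_g * u, where grad_v_dot_g stands for g^T * dV/dx.

    Setting dW/du + grad_v_dot_g = 0 gives u = -lam * tanh(grad_v_dot_g / (2*lam*r)),
    so the control never exceeds the bound lam in magnitude.
    """
    return -lam * math.tanh(grad_v_dot_g / (2.0 * lam * r))


if __name__ == "__main__":
    bound = 0.5
    for p in (-10.0, -1.0, 0.0, 1.0, 10.0):
        u = constrained_policy(p, bound)
        print(f"g^T dV/dx = {p:6.1f}  ->  u = {u: .4f}")
```

Under this assumed penalty the bound is enforced by the shape of the cost itself rather than by clipping the control afterwards, which is one common way a constraint can be "incorporated into the optimization" as the abstract describes.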

Citations
Journal ArticleDOI

Compensator-critic structure-based neuro-optimal control of modular robot manipulators with uncertain environmental contacts using non-zero-sum games

TL;DR: A novel compensator-critic structure-based neuro-optimal control approach is presented for modular robot manipulators in contact with uncertain environments, and the tracking error of the closed-loop robotic system is proved to be uniformly ultimately bounded (UUB) on the basis of Lyapunov theory.
Journal ArticleDOI

Optimal Synchronization Control of Heterogeneous Asymmetric Input-Constrained Unknown Nonlinear MASs via Reinforcement Learning

TL;DR: A data-based off-policy reinforcement learning (RL) algorithm is presented to learn the solution to the constrained HJB equation without requiring complete knowledge of the agents' dynamics.
Journal ArticleDOI

Event-triggered design for discrete-time nonlinear systems with control constraints

TL;DR: A non-quadratic function is given to encode the control constraints, the trigger condition is provided together with the stability analysis, and three simulation examples are presented to demonstrate the performance of the proposed event-triggered design for constrained discrete-time nonlinear systems.
Journal ArticleDOI

Asynchronous learning for actor-critic neural networks and synchronous triggering for multiplayer system.

Ke Wang, +1 more
- 01 Feb 2022 - 
TL;DR: An actor-critic neural network structure and reinforcement learning scheme are proposed to solve for the Nash equilibrium of a multiplayer nonzero-sum differential game in an adaptive fashion; each player consists of one critic and one actor and implements distributed asynchronous policy iteration to optimize its decision-making process.
References
Journal ArticleDOI

Nonzero-sum differential games

TL;DR: Nonzero-sum differential game theory for application to economic analysis, discussing Nash equilibrium, minimax, and noninferior strategy sets.
Journal ArticleDOI

Adaptive Neural Impedance Control of a Robotic Manipulator With Input Saturation

TL;DR: An adaptive impedance controller for a robotic manipulator with input saturation is developed by employing neural networks, and the input saturation is handled by designing an auxiliary system.
Journal ArticleDOI

Online learning control by association and reinforcement

TL;DR: A generic online learning control system based on the fundamental principle of reinforcement learning, or more specifically neural dynamic programming, is presented, with a focus on a systematic treatment for developing such a system.