Author

Yu Jiang

Bio: Yu Jiang is an academic researcher from New York University. The author has contributed to research topics including Dynamic programming & Optimal control. The author has an h-index of 18 and has co-authored 45 publications receiving 2,278 citations. Previous affiliations of Yu Jiang include Mitsubishi Electric Research Laboratories & South China University of Technology.

Papers
Journal ArticleDOI
TL;DR: This paper presents a novel policy iteration approach for finding online adaptive optimal controllers for continuous-time linear systems with completely unknown system dynamics; the approximate/adaptive dynamic programming technique iteratively solves the algebraic Riccati equation from online state and input information.
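
For context, the data-driven scheme described above emulates the classical model-based policy iteration (Kleinman's algorithm) for the algebraic Riccati equation; a minimal sketch of that model-based recursion is given below, assuming known matrices A, B and weights Q, R. The paper's contribution is to carry out the equivalent updates from online state and input data without knowing A or B.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def kleinman_policy_iteration(A, B, Q, R, K0, iters=20):
    """Model-based policy iteration for the continuous-time ARE.

    Starting from a stabilizing gain K0, alternate:
      1) policy evaluation: solve (A - B K)^T P + P (A - B K) + Q + K^T R K = 0
      2) policy improvement: K <- R^{-1} B^T P
    P converges to the stabilizing solution of the algebraic Riccati equation.
    """
    K = K0
    for _ in range(iters):
        Ak = A - B @ K
        # SciPy solves  M X + X M^T = C;  here M = Ak^T, C = -(Q + K^T R K)
        P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
        K = np.linalg.solve(R, B.T @ P)  # K = R^{-1} B^T P
    return P, K
```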

723 citations

Journal ArticleDOI
TL;DR: The proposed RADP methodology can be viewed as an extension of ADP to uncertain nonlinear systems and has been applied to the controller design problems for a jet engine and a one-machine power system.
Abstract: This paper studies the robust optimal control design for a class of uncertain nonlinear systems from a perspective of robust adaptive dynamic programming (RADP). The objective is to fill up a gap in the past literature of adaptive dynamic programming (ADP) where dynamic uncertainties or unmodeled dynamics are not addressed. A key strategy is to integrate tools from modern nonlinear control theory, such as the robust redesign and the backstepping techniques as well as the nonlinear small-gain theorem, with the theory of ADP. The proposed RADP methodology can be viewed as an extension of ADP to uncertain nonlinear systems. Practical learning algorithms are developed in this paper, and have been applied to the controller design problems for a jet engine and a one-machine power system.
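
The nonlinear small-gain theorem mentioned in the abstract is the tool used to account for dynamic uncertainties; for reference, its standard ISS form is stated below generically (this notation is illustrative, not the paper's): the composed gains of the two interconnected subsystems must form a strict contraction.

```latex
% ISS small-gain condition for two interconnected subsystems with
% class-K ISS gains \gamma_1 and \gamma_2: the interconnection is
% (input-to-state) stable provided
\gamma_1\!\left(\gamma_2(s)\right) < s, \qquad \text{for all } s > 0.
```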

328 citations

Journal ArticleDOI
TL;DR: In this article, a novel method of global adaptive dynamic programming (ADP) for the adaptive optimal control of nonlinear polynomial systems is presented, which consists of relaxing the problem of solving the Hamilton-Jacobi-Bellman (HJB) equation to an optimization problem, which is solved via a new policy iteration method.
Abstract: This paper presents a novel method of global adaptive dynamic programming (ADP) for the adaptive optimal control of nonlinear polynomial systems. The strategy consists of relaxing the problem of solving the Hamilton-Jacobi-Bellman (HJB) equation to an optimization problem, which is solved via a new policy iteration method. The proposed method distinguishes from previously known nonlinear ADP methods in that the neural network approximation is avoided, giving rise to significant computational improvement. Instead of semiglobally or locally stabilizing, the resultant control policy is globally stabilizing for a general class of nonlinear polynomial systems. Furthermore, in the absence of the a priori knowledge of the system dynamics, an online learning method is devised to implement the proposed policy iteration technique by generalizing the current ADP theory. Finally, three numerical examples are provided to validate the effectiveness of the proposed method.
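
For readers unfamiliar with the relaxation step, a minimal sketch of the idea is given below in LaTeX, assuming affine dynamics and a cost quadratic in the control (this notation is illustrative and not taken from the paper): the HJB equality is weakened to an inequality, and any feasible value function upper-bounds the optimal cost, so optimizing over feasible polynomial candidates tightens the bound without neural-network approximation.

```latex
% Dynamics \dot{x} = f(x) + g(x)u, cost J = \int_0^\infty \big(q(x) + u^\top R u\big)\,dt.
% Standard HJB equation for the optimal value function V^*:
\nabla V^\top f(x) + q(x) - \tfrac{1}{4}\,\nabla V^\top g(x) R^{-1} g(x)^\top \nabla V = 0.
% Relaxed problem: replace the equality by an inequality over a class of
% polynomial candidates; any feasible V satisfies V \ge V^*, since along the
% policy u_V = -\tfrac{1}{2} R^{-1} g^\top \nabla V one has
% \dot{V} \le -\big(q(x) + u_V^\top R\, u_V\big):
\nabla V^\top f(x) + q(x) - \tfrac{1}{4}\,\nabla V^\top g(x) R^{-1} g(x)^\top \nabla V \le 0.
```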

195 citations

Journal ArticleDOI
TL;DR: A novel optimal control design scheme is proposed for continuous-time nonaffine nonlinear dynamic systems with unknown dynamics by adaptive dynamic programming (ADP), which iteratively updates the control policy online by using the state and input information without identifying the system dynamics.

184 citations

Journal ArticleDOI
TL;DR: The obtained adaptive and optimal output-feedback controllers differ from the existing ADP literature in that they are derived from sampled-data systems theory and are guaranteed to be robust to dynamic uncertainties.

183 citations


Cited by
Book ChapterDOI
11 Dec 2012

1,704 citations

Journal ArticleDOI
TL;DR: This book develops the theory and methods of approximation: existence and uniqueness of best approximations, approximation operators, polynomial interpolation, minimax and least-squares approximation, and spline methods.
Abstract: Preface 1. The approximation problem and existence of best approximations 2. The uniqueness of best approximations 3. Approximation operators and some approximating functions 4. Polynomial interpolation 5. Divided differences 6. The uniform convergence of polynomial approximations 7. The theory of minimax approximation 8. The exchange algorithm 9. The convergence of the exchange algorithm 10. Rational approximation by the exchange algorithm 11. Least squares approximation 12. Properties of orthogonal polynomials 13. Approximation of periodic functions 14. The theory of best L1 approximation 15. An example of L1 approximation and the discrete case 16. The order of convergence of polynomial approximations 17. The uniform boundedness theorem 18. Interpolation by piecewise polynomials 19. B-splines 20. Convergence properties of spline approximations 21. Knot positions and the calculation of spline approximations 22. The Peano kernel theorem 23. Natural and perfect splines 24. Optimal interpolation Appendices Index.
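
To make two of the listed topics concrete (least squares approximation, Chapter 11; B-splines, Chapters 18-20), a small illustrative Python snippet using NumPy/SciPy is given below; it is not tied to the book's own algorithms.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

# Sample a smooth test function on [0, 1]
x = np.linspace(0.0, 1.0, 21)
f = lambda t: np.exp(t) * np.sin(2 * np.pi * t)
y = f(x)

# Least-squares polynomial approximation of degree 5
coeffs = np.polynomial.polynomial.polyfit(x, y, deg=5)
poly = np.polynomial.polynomial.Polynomial(coeffs)

# Cubic B-spline interpolant through the same data
spline = make_interp_spline(x, y, k=3)

# Compare maximum errors on a finer grid
xx = np.linspace(0.0, 1.0, 401)
print("max |f - poly|  :", np.max(np.abs(f(xx) - poly(xx))))
print("max |f - spline|:", np.max(np.abs(f(xx) - spline(xx))))
```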

841 citations

Journal ArticleDOI
TL;DR: This paper presents a novel policy iteration approach for finding online adaptive optimal controllers for continuous-time linear systems with completely unknown system dynamics; the approximate/adaptive dynamic programming technique iteratively solves the algebraic Riccati equation from online state and input information.

723 citations

Journal ArticleDOI
TL;DR: Q-learning and the integral RL algorithm are discussed as core algorithms for discrete-time (DT) and continuous-time (CT) systems, respectively, along with a new direction of off-policy RL for both CT and DT systems.
Abstract: This paper reviews the current state of the art on reinforcement learning (RL)-based feedback control solutions to optimal regulation and tracking of single and multiagent systems. Existing RL solutions to both optimal $\mathcal {H}_{2}$ and $\mathcal {H}_\infty $ control problems, as well as graphical games, will be reviewed. RL methods learn the solution to optimal control and game problems online and using measured data along the system trajectories. We discuss Q-learning and the integral RL algorithm as core algorithms for discrete-time (DT) and continuous-time (CT) systems, respectively. Moreover, we discuss a new direction of off-policy RL for both CT and DT systems. Finally, we review several applications.
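
As a pointer for readers, the discrete-time core algorithm named in the review, Q-learning, is summarized by a one-line temporal-difference update; a minimal tabular sketch follows (generic illustration, not the review's notation; the continuous-time integral RL counterpart replaces the one-step reward with an integral of the reward over a sampling interval).

```python
import numpy as np

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s_next, a') - Q(s, a))
    Q is an (n_states, n_actions) array; s, a, s_next are integer indices.
    """
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```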

536 citations