Topic

Bellman equation

About: The Bellman equation is a research topic. Over its lifetime, 5,884 publications have been published within this topic, receiving 135,589 citations.
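For reference, the relation the topic is named after, in generic discounted discrete-time form (notation ours, not drawn from any one paper below):

\[
V^*(x) = \min_{u \in U} \bigl\{ \ell(x,u) + \gamma\, V^*(f(x,u)) \bigr\}, \qquad 0 < \gamma \le 1,
\]

where $f$ is the system dynamics, $\ell$ the stage cost, and $\gamma$ the discount factor. Its continuous-time counterpart is the Hamilton–Jacobi–Bellman (HJB) partial differential equation that recurs in the papers below.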


Papers
Proceedings Article
01 Jan 2009
TL;DR: The need for partial knowledge of the nonlinear system dynamics is relaxed in the development of a novel approach to ADP using a two-part process: online system identification and offline optimal control training.
Abstract: The optimal control of linear systems with quadratic cost functions can be achieved by solving the well-known Riccati equation. However, the optimal control of nonlinear discrete-time systems is a much more challenging task that often requires solving the nonlinear Hamilton–Jacobi–Bellman (HJB) equation. In the recent literature, discrete-time approximate dynamic programming (ADP) techniques have been widely used to determine optimal or near-optimal control policies for affine nonlinear discrete-time systems. However, ADP inherently assumes that the value of the controlled system one step ahead, and at least partial knowledge of the system dynamics, are known. In this work, the need for partial knowledge of the nonlinear system dynamics is relaxed through a novel two-part approach to ADP: online system identification and offline optimal control training. First, in the system identification process, a neural network (NN) is tuned online using novel tuning laws to learn the complete plant dynamics, so that local asymptotic stability of the identification error can be shown. Then, using only the learned NN system model, offline ADP is performed, resulting in a novel optimal control law. The proposed scheme does not require explicit knowledge of the system dynamics, as only the learned NN model is needed. A proof of convergence is demonstrated, and simulation results verify the theoretical conjectures.

131 citations
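A minimal sketch of the two-part process described above: an online NN identifier followed by offline value iteration that uses only the learned model. The network sizes, the plain-gradient tuning law (standing in for the paper's novel tuning laws), and the quadratic cost are illustrative assumptions, not the authors' implementation:

    import numpy as np

    rng = np.random.default_rng(0)
    nx, nu, nh = 2, 1, 32                      # state/input/hidden sizes (assumed)
    W1 = rng.normal(0.0, 0.1, (nh, nx + nu))   # identifier NN weights
    W2 = rng.normal(0.0, 0.1, (nx, nh))

    def nn_model(x, u):
        """Learned one-step model: x_{k+1} ~ W2 @ tanh(W1 @ [x; u])."""
        z = np.tanh(W1 @ np.concatenate([x, u]))
        return W2 @ z, z

    def identify_step(x, u, x_next, lr=1e-2):
        """Online identification: one gradient step on the one-step
        prediction error (a stand-in for the paper's tuning laws)."""
        global W1, W2
        pred, z = nn_model(x, u)
        e = pred - x_next                      # identification error
        W2 -= lr * np.outer(e, z)
        dz = (W2.T @ e) * (1.0 - z**2)         # backprop through tanh
        W1 -= lr * np.outer(dz, np.concatenate([x, u]))

    def offline_adp(xs, us, gamma=0.95, sweeps=200):
        """Offline ADP: fitted value iteration using only the learned model."""
        V = {tuple(x): 0.0 for x in xs}
        nearest = lambda y: tuple(min(xs, key=lambda g: np.linalg.norm(g - y)))
        for _ in range(sweeps):
            for x in xs:
                V[tuple(x)] = min(float(x @ x) + float(u @ u)   # quadratic cost
                                  + gamma * V[nearest(nn_model(x, u)[0])]
                                  for u in us)
        return V

Observed plant transitions (x, u, x_next) drive identify_step online; once the identifier has converged, offline_adp runs entirely on the learned model over a user-supplied state grid xs and control set us, mirroring the "no explicit dynamics" claim.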

Journal Article
TL;DR: In this article, several characterizations of optimal trajectories for the classical Mayer problem in optimal control are provided, and the problem of optimal design is addressed, yielding sufficient conditions for optimality.
Abstract: Several characterizations of optimal trajectories for the classical Mayer problem in optimal control are provided. For this purpose the regularity of directional derivatives of the value function is studied: for instance, it is shown that for smooth control systems the value function $V$ is continuously differentiable along an optimal trajectory $x:[t_0,1] \to \mathbf{R}^n$ provided $V$ is differentiable at the initial point $(t_0, x(t_0))$. Then the upper semicontinuity of the optimal feedback map is deduced. The problem of optimal design is addressed, obtaining sufficient conditions for optimality. Finally, it is shown that the optimal control problem may be reduced to a viability problem.

130 citations
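To fix notation (ours, for orientation only), the Mayer problem penalizes only the terminal state, and its value function is

\[
V(t_0, x_0) = \inf \bigl\{ \varphi(x(1)) : \dot x(t) = f(x(t), u(t)),\; u(t) \in U,\; x(t_0) = x_0 \bigr\}.
\]

By the dynamic programming principle, $t \mapsto V(t, x(t))$ is nondecreasing along every admissible trajectory and constant exactly along the optimal ones, which is why the regularity of $V$ and of its directional derivatives drives the characterizations above.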

Journal Article
TL;DR: The key step in the proof of these new estimates is the introduction of a switching system which allows the construction of approximate, (almost) smooth supersolutions for the Hamilton–Jacobi–Bellman equation.
Abstract: We obtain error bounds for monotone approximation schemes of Hamilton–Jacobi–Bellman equations. These bounds improve previous results of Krylov and of the authors. The key step in the proof of these new estimates is the introduction of a switching system which allows the construction of approximate, (almost) smooth supersolutions for the Hamilton–Jacobi–Bellman equation.

129 citations
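For concreteness, here is one classical monotone scheme of the kind such error bounds cover: a semi-Lagrangian discretization of the 1-D stationary HJB equation $\rho u(x) + \max_a \{ -f(x,a) u'(x) - \ell(x,a) \} = 0$. The grid, dynamics, and cost below are invented for illustration; the paper's analysis applies to abstract monotone schemes, not to this particular code:

    import numpy as np

    xs = np.linspace(-1.0, 1.0, 201)        # state grid
    A = np.linspace(-1.0, 1.0, 21)          # discretized control set
    rho, dt = 1.0, 0.01                     # discount rate, pseudo-time step

    f = lambda x, a: a                      # toy controlled dynamics x' = a
    ell = lambda x, a: x**2 + 0.1 * a**2    # toy running cost

    u = np.zeros_like(xs)
    for _ in range(5000):                   # fixed-point (value) iteration
        # The scheme is monotone: np.interp forms convex combinations of
        # grid values and the factor (1 - rho*dt) is nonnegative.
        cand = np.stack([dt * ell(xs, a)
                         + (1.0 - rho * dt) * np.interp(xs + dt * f(xs, a), xs, u)
                         for a in A])
        u_new = cand.min(axis=0)
        if np.max(np.abs(u_new - u)) < 1e-10:
            break
        u = u_new

The update is a contraction with factor $1 - \rho\,dt$, so the iteration converges; error bounds of the cited kind quantify how far such a discrete solution can sit from the viscosity solution as the mesh is refined.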

Journal Article
TL;DR: In this article, the authors present a general framework for deriving continuous dependence estimates for (possibly polynomially growing) viscosity solutions of fully nonlinear degenerate parabolic integro-PDEs.

128 citations

Journal Article
TL;DR: Monotonicity of the local value iteration ADP algorithm is established, showing that under certain conditions on the initial value function and the learning rate function, the iterative value function converges monotonically to the optimum.
Abstract: In this paper, convergence properties are established for the newly developed discrete-time local value iteration adaptive dynamic programming (ADP) algorithm. The local iterative ADP algorithm permits an arbitrary positive semidefinite function as the initial value function. Employing a state-dependent learning rate function, the iterative value function and iterative control law can, for the first time, be updated on a subset of the state space instead of the whole state space, which effectively reduces the computational burden. A new analysis method is developed to prove that the iterative value functions converge to the optimum under some mild constraints. Monotonicity of the local value iteration ADP algorithm is also established, showing that under certain conditions on the initial value function and the learning rate function, the iterative value function converges monotonically to the optimum. Finally, three simulation examples and comparisons are given to illustrate the performance of the developed algorithm.

128 citations
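A toy sketch of the "local" update at the heart of the algorithm above: per sweep, only a subset of states is refreshed, and a learning-rate function blends old and new values. The random MDP, the subset rule, and the iteration-dependent rate (a simplification of the paper's state-dependent rate) are illustrative stand-ins, not the authors' algorithm:

    import numpy as np

    rng = np.random.default_rng(1)
    nS, nA, gamma = 50, 4, 0.95
    P = rng.dirichlet(np.ones(nS), size=(nS, nA))   # transition probabilities
    C = rng.random((nS, nA))                        # stage costs

    V = np.zeros(nS)            # positive semidefinite initialization (zero)
    for k in range(500):
        subset = rng.choice(nS, size=nS // 5, replace=False)   # local state set
        alpha = 1.0 / (1.0 + 0.01 * k)                         # learning-rate function
        for s in subset:
            backup = np.min(C[s] + gamma * P[s] @ V)           # Bellman backup
            V[s] = (1.0 - alpha) * V[s] + alpha * backup       # relaxed local update

Only nS // 5 states are touched per sweep, which is the sense in which the per-iteration computational burden is reduced; convergence and monotonicity then hinge on the initialization and the rate function, as the abstract states.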


Network Information
Related Topics (5)

Topic                   Papers    Citations    Related
Optimal control         68K       1.2M         87%
Bounded function        77.2K     1.3M         85%
Markov chain            51.9K     1.3M         85%
Linear system           59.5K     1.4M         84%
Optimization problem    96.4K     2.1M         83%
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    261
2022    537
2021    369
2020    411
2019    348
2018    353