Topic
Bellman equation
About: The Bellman equation is a research topic. Over its lifetime, 5884 publications have been published on this topic, receiving 135589 citations.
Papers published on a yearly basis
Papers
01 Jan 2009
TL;DR: The need for partial knowledge of the nonlinear system dynamics is relaxed by developing a novel approach to ADP as a two-part process: online system identification and offline optimal control training.
Abstract: The optimal control of linear systems with quadratic cost functions can be achieved by solving the well-known Riccati equation. However, the optimal control of nonlinear discrete-time systems is a much more challenging task that often requires solving the nonlinear Hamilton–Jacobi–Bellman (HJB) equation. In the recent literature, discrete-time approximate dynamic programming (ADP) techniques have been widely used to determine optimal or near-optimal control policies for affine nonlinear discrete-time systems. However, an inherent assumption of ADP is that the value of the controlled system one step ahead, and at least partial knowledge of the system dynamics, are known. In this work, the need for partial knowledge of the nonlinear system dynamics is relaxed by developing a novel approach to ADP as a two-part process: online system identification and offline optimal control training. First, in the system identification process, a neural network (NN) is tuned online using novel tuning laws to learn the complete plant dynamics, so that local asymptotic stability of the identification error can be shown. Then, using only the learned NN system model, offline ADP is performed, resulting in a novel optimal control law. The proposed scheme does not require explicit knowledge of the system dynamics, as only the learned NN model is needed. The proof of convergence is demonstrated, and simulation results verify the theoretical conjecture.
131 citations
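The abstract above opens with the standard fact that linear-quadratic optimal control reduces to the Riccati equation. A minimal sketch of that baseline, solving the discrete-time algebraic Riccati equation by fixed-point (value) iteration with hypothetical system matrices (this is the classical LQR recursion, not the paper's NN-based scheme):

```python
import numpy as np

def dare_iterate(A, B, Q, R, iters=500):
    """Fixed-point iteration for the discrete-time algebraic Riccati equation:
    P <- Q + A' P A - A' P B (R + B' P B)^{-1} B' P A."""
    P = Q.copy()
    for _ in range(iters):
        BtP = B.T @ P
        K = np.linalg.solve(R + BtP @ B, BtP @ A)  # optimal feedback gain
        P = Q + A.T @ P @ (A - B @ K)              # Riccati update
    return P, K

# Hypothetical discretized double integrator
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
P, K = dare_iterate(A, B, Q, R)
# u = -K x is the optimal control; A - B K is Schur stable
```

This recursion is exactly value iteration on the quadratic value function V(x) = x'Px, which is why the nonlinear case in the abstract, where no such closed form exists, requires function approximation instead.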
TL;DR: In this article, several characterizations of optimal trajectories for the classical Mayer problem in optimal control are provided, and the problem of optimal design is addressed, obtaining sufficient conditions for optimality.
Abstract: Several characterizations of optimal trajectories for the classical Mayer problem in optimal control are provided. For this purpose the regularity of directional derivatives of the value function is studied: for instance, it is shown that for smooth control systems the value function V is continuously differentiable along an optimal trajectory $x:[t_0, 1] \to {\bf R}^n$ provided V is differentiable at the initial point $(t_0, x(t_0))$. Then the upper semicontinuity of the optimal feedback map is deduced. The problem of optimal design is addressed, obtaining sufficient conditions for optimality. Finally, it is shown that the optimal control problem may be reduced to a viability one.
130 citations
TL;DR: The key step in the proof of these new estimates is the introduction of a switching system which allows the construction of approximate, (almost) smooth supersolutions for the Hamilton–Jacobi–Bellman equation.
Abstract: We obtain error bounds for monotone approximation schemes of Hamilton–Jacobi–Bellman equations. These bounds improve previous results of Krylov and the authors. The key step in the proof of these new estimates is the introduction of a switching system which allows the construction of approximate, (almost) smooth supersolutions for the Hamilton–Jacobi–Bellman equation.
129 citations
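To make "monotone approximation scheme" concrete, here is a toy sketch (not the paper's scheme) of the classic Godunov upwind discretization of a static Hamilton–Jacobi equation, the 1D eikonal problem |u'(x)| = 1 on (0,1) with u(0) = u(1) = 0. Monotonicity of the update in its stencil arguments is what guarantees convergence to the viscosity solution:

```python
import numpy as np

def solve_eikonal(n=101, iters=500):
    """Monotone (Godunov upwind) scheme for |u'| = 1 on (0,1), u(0)=u(1)=0.
    Fixed-point update u_i = min(u_{i-1}, u_{i+1}) + h is nondecreasing in
    each neighbor value, i.e. the scheme is monotone."""
    h = 1.0 / (n - 1)
    u = np.full(n, np.inf)
    u[0] = u[-1] = 0.0
    for _ in range(iters):
        u[1:-1] = np.minimum(u[:-2], u[2:]) + h  # Jacobi sweep
    return u

u = solve_eikonal()
x = np.linspace(0.0, 1.0, 101)
# converges to the viscosity solution u(x) = min(x, 1 - x)
```

The error bounds in the abstract concern exactly this class of schemes, in the much harder fully nonlinear second-order HJB setting.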
TL;DR: In this article, the authors present a general framework for deriving continuous dependence estimates for, possibly polynomially growing, viscosity solutions of fully nonlinear degenerate parabolic integro-PDEs.
128 citations
TL;DR: Monotonicity of the local value iteration ADP algorithm is presented, which shows that under some special conditions of the initial value function and the learning rate function, the iterative value function can monotonically converge to the optimum.
Abstract: In this paper, convergence properties are established for the newly developed discrete-time local value iteration adaptive dynamic programming (ADP) algorithm. The present local iterative ADP algorithm permits an arbitrary positive semidefinite function to initialize the algorithm. Employing a state-dependent learning rate function, for the first time, the iterative value function and iterative control law can be updated in a subset of the state space instead of the whole state space, which effectively reduces the computational burden. A new analysis method for the convergence property is developed to prove that the iterative value functions converge to the optimum under some mild constraints. Monotonicity of the local value iteration ADP algorithm is presented, which shows that under some special conditions on the initial value function and the learning rate function, the iterative value function converges monotonically to the optimum. Finally, three simulation examples and comparisons are given to illustrate the performance of the developed algorithm.
128 citations
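The monotonicity result above has a simple analogue in the tabular setting. A minimal sketch of classical value iteration on a hypothetical two-state, two-action MDP (the paper's local ADP algorithm with state-dependent learning rates is far more general than this): with nonnegative rewards and a zero initialization, each Bellman update can only raise the value estimate, so the iterates increase monotonically to the optimum.

```python
import numpy as np

def value_iteration(P, r, gamma=0.9, iters=200):
    """Classical value iteration: V <- max_a [ r(a,s) + gamma * sum_s' P(a,s,s') V(s') ].
    P has shape (A, S, S) (row-stochastic in s'), r has shape (A, S).
    With r >= 0 and V = 0 initially, the iterates are monotonically
    nondecreasing because the Bellman operator is monotone."""
    V = np.zeros(P.shape[1])
    for _ in range(iters):
        V = np.max(r + gamma * (P @ V), axis=0)  # Bellman optimality update
    return V

# Hypothetical MDP: 2 states, 2 actions
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.0, 1.0]]])
r = np.array([[1.0, 0.0],
              [0.0, 2.0]])
V = value_iteration(P, r)
# V now satisfies the Bellman optimality equation to within gamma^iters
```

Since the update is a gamma-contraction in the sup norm, 200 iterations leave a residual on the order of 0.9^200, i.e. numerically zero.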