
Showing papers on "Bellman equation published in 1968"


Journal ArticleDOI
TL;DR: Differential dynamic programming, as discussed by the authors, is a technique based on dynamic programming rather than the calculus of variations for determining the optimal control function of a nonlinear system; it applies the principle of optimality in the neighborhood of a nominal, possibly nonoptimal, trajectory.
Abstract: Differential dynamic programming is a technique, based on dynamic programming rather than the calculus of variations, for determining the optimal control function of a nonlinear system. Unlike conventional dynamic programming where the optimal cost function is considered globally, differential dynamic programming applies the principle of optimality in the neighborhood of a nominal, possibly nonoptimal, trajectory. This allows the coefficients of a linear or quadratic expansion of the cost function to be computed in reverse time along the trajectory: these coefficients may then be used to yield a new improved trajectory (i.e., the algorithms are of the "successive sweep" type). A class of nonlinear control problems, linear in the control variables, is studied using differential dynamic programming. It is shown that for the free-end-point problem, the first partial derivatives of the optimal cost function are continuous throughout the state space, and the second partial derivatives experience jumps at switch points of the control function. A control problem that has an analytic solution is used to illustrate these points. The fixed-end-point problem is converted into an equivalent free-end-point problem by adjoining the end-point constraints to the cost functional using Lagrange multipliers: a useful interpretation for Pontryagin's adjoint variables for this type of problem emerges from this treatment. The above results are used to devise new second- and first-order algorithms for determining the optimal bang-bang control by successively improving a nominal guessed control function. The usefulness of the proposed algorithms is illustrated by the computation of a number of control problem examples.
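The "successive sweep" structure described above can be pictured with a small numerical sketch. The Python fragment below is illustrative only: the scalar dynamics, cost weights, horizon, and step size are invented, and the second-order dynamics terms of full differential dynamic programming are dropped (a common first-order simplification), so it shows only the backward/forward sweep idea, not the paper's algorithm.

# Minimal sketch of a DDP-style successive sweep (all numbers are assumed).
import numpy as np

dt, N = 0.05, 60                          # step size and horizon (assumed values)

def f(x, u):                              # discretized dynamics x_{k+1} = f(x_k, u_k)
    return x + dt * (-x**3 + u)

def fx(x): return 1.0 - 3.0 * dt * x**2   # df/dx
FU = dt                                   # df/du (constant for this model)

def ddp_sweep(x0, u):
    # forward rollout of the nominal (possibly nonoptimal) control sequence
    x = np.empty(N + 1); x[0] = x0
    for k in range(N):
        x[k + 1] = f(x[k], u[k])
    # backward sweep: propagate the coefficients Vx, Vxx of a quadratic
    # expansion of the cost-to-go about the nominal trajectory
    Vx, Vxx = 10.0 * x[N], 10.0           # from terminal cost 5*x_N**2
    kff, Kfb = np.zeros(N), np.zeros(N)
    for k in reversed(range(N)):
        Qx  = x[k] + fx(x[k]) * Vx        # running cost 0.5*(x**2 + 0.1*u**2)
        Qu  = 0.1 * u[k] + FU * Vx
        Qxx = 1.0 + fx(x[k])**2 * Vxx
        Quu = 0.1 + FU**2 * Vxx
        Qux = fx(x[k]) * FU * Vxx
        kff[k], Kfb[k] = -Qu / Quu, -Qux / Quu
        Vx  = Qx + Kfb[k] * Quu * kff[k] + Kfb[k] * Qu + Qux * kff[k]
        Vxx = Qxx + Kfb[k] * Quu * Kfb[k] + 2.0 * Kfb[k] * Qux
    # forward pass: the expansion coefficients yield a new, improved trajectory
    xn, un = x0, u.copy()
    for k in range(N):
        un[k] = u[k] + kff[k] + Kfb[k] * (xn - x[k])
        xn = f(xn, un[k])
    return un

u = np.zeros(N)
for _ in range(20):                       # successive sweeps from a guessed control
    u = ddp_sweep(0.8, u)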

58 citations


01 Dec 1968
TL;DR: Several extensions and variations of the basic approach developed in the thesis are discussed, including systems with correlated measurement noise, systems with Markov dependent parameters, a special non-adaptive controller configuration, and infinite time adaptive control processes.
Abstract: The report presents an approach to the design of adaptive controllers for digital control problems involving linear stochastic systems with unknown parameters and performance indices which are quadratic functions of the state and control variables. The method is based upon two approximate solutions of the dynamic programming equation which is associated with the adaptive control problem. One solution constitutes an upper bound on the optimal cost of the adaptive control process, and the second solution constitutes a lower bound on the optimal cost. The upper bound leads to a control system which is linear in the state estimates and which realizes an actual operating cost less than or equal to the upper bound. This linear control system is developed in detail for systems with parameters which assume only finitely many values. Its performance is illustrated with a simple example. Several extensions and variations of the basic approach developed in the thesis are also discussed. These include systems with correlated measurement noise, systems with Markov dependent parameters, a special non-adaptive controller configuration, and infinite time adaptive control processes. (Author)
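The bounding structure described above can be written compactly. The symbols below (state x_k, control u_k, information I_k available to the controller at stage k, weighting matrices Q and R) are generic placeholders, not the report's own notation:

V_k(I_k) = \min_{u_k} \, \mathbb{E}\big[\, x_k^{\mathsf T} Q x_k + u_k^{\mathsf T} R u_k + V_{k+1}(I_{k+1}) \,\big|\, I_k \big],
\qquad
V_k^{L}(I_k) \le V_k(I_k) \le V_k^{U}(I_k).

The controller obtained by minimizing the upper-bound recursion is linear in the state estimate, and the cost it actually incurs is no greater than V_0^{U}, while V_0^{L} brackets the unknown optimal cost from below.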

5 citations


01 Nov 1968
TL;DR: In this article, two solution concepts, the Nash equilibrium and the noninferior set, are studied for the general nonzero-sum differential game with N players, each controlling a different set of inputs to a single nonlinear dynamic system and each trying to minimize a different performance criterion.
Abstract: : The general nonzero-sum differential game has N players, each controlling a different set of inputs to a single nonlinear dynamic system and each trying to minimize a different performance criterion. Several interesting new phenomena arise in these general games which are absent in the two best-known special cases (the optimal control problem and the two person zero-sum differential game). This paper considers some of the difficulties which arise in attempting to generalize ideas which are well-known in optimal control theory, such as the 'principle of optimality' and the relation between 'open-loop' and 'closed-loop' controls. Two types of 'solutions' are discussed: the 'Nash equilibrium' and the 'noninferior set.' Some simple multistage discrete (bimatrix) games are used to illustrate phenomena which also arise in the continuous formulation. (Author)
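A small sketch in the spirit of the bimatrix examples mentioned above: the Python fragment enumerates the pure-strategy Nash equilibria and the noninferior (Pareto-optimal) outcomes of a static two-player cost-minimizing game. The cost matrices are invented for illustration and are not taken from the paper.

# A[i, j]: cost to player 1;  B[i, j]: cost to player 2.
# A prisoner's-dilemma-like example: the unique Nash equilibrium is inferior.
import numpy as np

A = np.array([[1.0, 10.0],
              [0.0,  5.0]])
B = np.array([[1.0,  0.0],
              [10.0, 5.0]])
rows, cols = A.shape

# Nash equilibrium: neither player can lower his own cost by deviating alone.
nash = [(i, j) for i in range(rows) for j in range(cols)
        if A[i, j] <= A[:, j].min() and B[i, j] <= B[i, :].min()]

def inferior(i, j):
    # (i, j) is inferior if some other outcome is at least as good for both
    # players and strictly better for at least one of them.
    return any(A[p, q] <= A[i, j] and B[p, q] <= B[i, j]
               and (A[p, q] < A[i, j] or B[p, q] < B[i, j])
               for p in range(rows) for q in range(cols) if (p, q) != (i, j))

noninferior = [(i, j) for i in range(rows) for j in range(cols)
               if not inferior(i, j)]

print("Nash equilibria:", nash)         # -> [(1, 1)], joint costs (5, 5)
print("Noninferior set:", noninferior)  # the Nash point (1, 1) is not in it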

3 citations


Journal ArticleDOI
TL;DR: In this paper, the authors use dynamic programming concepts and principles to develop two alternatives to the conventional method for analytically solving a variational problem, which requires determining a particular solution of the fundamental partial differential equation, namely the optimal value (or return) function.
Abstract: The conventional dynamic programming method for analytically solving a variational problem requires the determination of a particular solution, the optimal value function or return function, of the fundamental partial differential equation. Associated with it is another function, the optimal policy function. At each point, this function yields the value of the slope of the optimal curve to that point (or from that point, depending on the method of solution). The optimal curve itself can then be found by integration. In this paper, dynamic programming concepts and principles are used to develop two alternatives to the conventional method of solution. In the first method, a particular solution of two simultaneous partial differential equations is used to generate optimal curves by differentiations and solution of simultaneous equations. In the second method, any solution of the fundamental equation containing an appropriate number of arbitrary constants is sought. It is shown how such a function yields directly, by differentiations and solution of simultaneous equations, the optimal curve for a given problem. While the derivations to follow are new, the results are equivalent to those of a method due to Hamilton and its modification due to Jacobi.
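For reference, the fundamental equation and the Jacobi construction alluded to in the last sentence can be stated in their standard textbook form, with S(t, x) the optimal value (return) function of a variational problem that minimizes an integral of f(t, x, \dot{x}):

\frac{\partial S}{\partial t} + \min_{\dot{x}} \Big[\, f(t, x, \dot{x}) + \frac{\partial S}{\partial x}\, \dot{x} \,\Big] = 0 .

If S(t, x, \alpha) is any solution containing an arbitrary constant \alpha (or, in higher dimensions, an appropriate number of constants), then setting \partial S / \partial \alpha = \beta, with \beta a further constant, defines by differentiation and solution of simultaneous equations a family of extremal curves; the optimal curve for a given problem is picked out by choosing \alpha and \beta to match its boundary conditions. This is the classical Hamilton–Jacobi result to which, as the authors note, their second method is equivalent.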

2 citations