
Showing papers on "Bellman equation published in 1969"


Journal ArticleDOI
TL;DR: A perturbation method for solving the Hamilton-Jacobi-Bellman partial differential equation for the optimal value function in non-linear control problems is described, together with a second-order on-line control scheme based on this perturbation series.
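One standard way to set up such a series (a sketch only, under assumed notation; the paper's exact formulation may differ) is to treat the non-linearity as a small perturbation of a linear-quadratic problem and expand the value function in powers of the perturbation parameter:

\[
\dot x = Ax + \varepsilon\, g(x) + Bu, \qquad
J = \int_0^{\infty} \bigl(x^{\mathsf T} Q x + u^{\mathsf T} R u\bigr)\,dt, \qquad
V(x) = V_0(x) + \varepsilon V_1(x) + \varepsilon^2 V_2(x) + \cdots
\]

Here \(V_0(x) = x^{\mathsf T} P x\) solves the Riccati equation of the unperturbed linear-quadratic problem; substituting the series into the Hamilton-Jacobi-Bellman equation and matching powers of \(\varepsilon\) gives linear equations for \(V_1, V_2, \dots\), and the feedback \(u = -\tfrac12 R^{-1} B^{\mathsf T}\nabla V(x)\), truncated after the second-order correction, yields an on-line control law in the spirit of the scheme described above.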

22 citations


Journal ArticleDOI
01 May 1969
TL;DR: A brief outline of how the method of dynamic programming extends to the control of a Markov process is given, and some of the difficulties which arise when information about the state of the process is statistical are mentioned.
Abstract: The aim of this paper is to give a brief outline of how the method of dynamic programming extends to the control of a Markov process and to mention some of the difficulties which arise when information about the state of the process is statistical. The development of the subject has been rapid and, whilst this is encouraging, the corresponding growth in the literature is alarming. There will be no attempt here even to do justice to the main contributions. Our discussion will be limited to a comparison of various types of model and some of the different techniques which may be useful in formulating and solving problems of optimization.

The application of Pontryagin's maximum principle to deterministic systems depends on the existence of a unique optimal path between the present state and some terminal state, since the solution is developed by integration backwards along this path. But, for stochastic systems, no such path can be defined and because of this, attempts to generalize the principle do not seem to have been very successful: see Kushner (1965). On the other hand, a Markovian point of view is already implicit in Bellman's principle of optimality: immediate costs should be minimized with future costs in mind, but the past may be neglected. There is no difficulty in extending this to any Markov process, simply by treating future expectations rather than actual costs. But an inductive method of constructing optimal policies is no more than a first step towards understanding the relation between policy and expectation. The determination of more explicit solutions depends a great deal on the simplifying assumptions made in choosing a particular mathematical model.

A discussion of simple diffusion models is included here to illustrate the advantages obtained when we can rely on the methods of differential calculus. The detailed example of an insurance model is an oversimplified representation of any real business, but it reflects some features of insurance sensibly and, because of its simplicity, it is open to more informed criticism and hence, perhaps, to more realistic development.

Stochastic control theory is concerned with decision-making when there is uncertainty about the effect of particular decisions on the future course of events. The need for special care in model building becomes even more apparent when we admit also that decisions must be based on approximate information. The investigation of optimal decisions depends critically on whether this information can be represented conveniently; for example, in terms of a small number of sufficient statistics. Again, it seems necessary to adopt a Bayesian approach to unknown parameters, if we are to retain the Markovian structure of the decision process. The statistical aspects of control theory deserve more attention. So far, progress has perhaps been limited by a static view of unknown parameters and one purpose of the remarks at the end of this paper is to suggest a re-examination of this view.
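The principle sketched above — minimize expected immediate cost plus expected future cost, neglecting the past — is easiest to see in the discrete-time, finite-state case. The following sketch (a discounted problem with illustrative numbers, not taken from the paper) iterates the Bellman update for a finite Markov decision problem:

```python
import numpy as np

# Value iteration for a small, discounted Markov decision problem.
# All numbers are illustrative; the point is the Bellman update itself:
# minimize expected immediate cost plus expected discounted future cost.

n_states, n_actions = 4, 2
rng = np.random.default_rng(0)

cost = rng.uniform(0.0, 1.0, size=(n_states, n_actions))          # c(s, a)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a, s']
beta = 0.9                                                         # discount factor

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman update: Q(s, a) = c(s, a) + beta * E[V(s') | s, a]
    Q = cost + beta * (P @ V)
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new

policy = Q.argmin(axis=1)  # action minimizing present plus expected future cost
print(V, policy)
```

The fixed point of this update is the expected total discounted cost of an optimal policy from each state, and the minimizing actions define that policy.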

8 citations



Journal ArticleDOI
TL;DR: In this article, the authors considered the problem of finding a relationship between optimal control problems and differential games and showed that two classes of differential games have the property that their solution, the value function, can be constructed from solutions to associated one-player optimization problems.
Abstract: The task of finding a relationship between optimal control problems and differential games is considered. Two classes of differential games are shown to have the property that their solution, the value function, can be constructed from solutions to associated one-player optimal control problems.
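A minimal illustration of how a game value can be assembled from one-player problems (an assumed toy case, not the specific classes treated in the paper): when the two players control completely separate subsystems and the cost is additively separable, the minimisation and maximisation decouple,

\[
\dot x_1 = f_1(x_1,u), \quad \dot x_2 = f_2(x_2,v), \qquad
J = \int_t^{T}\bigl[L_1(x_1,u) - L_2(x_2,v)\bigr]\,ds,
\]
\[
V(t,x_1,x_2) \;=\; \min_{u(\cdot)}\,\max_{v(\cdot)}\, J \;=\; V_1(t,x_1) - V_2(t,x_2),
\]

where \(V_1\) and \(V_2\) are the value functions of the two one-player optimal control problems obtained by minimising \(\int L_1\,ds\) and \(\int L_2\,ds\) separately.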

2 citations


Journal ArticleDOI
TL;DR: The Bellman function is the envelope of a set of Krotov functions: every Krotov function coincides with the Bellman function of another problem with the same conditions and a return function with a less "sharp" optimum.
Abstract: Every Krotov function in some dynamic programming problem is identical with a Bellman function in another problem with the same conditions and with a return function of a less "sharp" optimum. The Bellman function is the envelope of a set of Krotov functions.
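A sketch of the envelope statement in one common convention (treating Krotov-type functions as verification functions for a maximised return; signs and details differ between formulations): for the problem of maximising \(J = \int_t^{T} r(s,x,u)\,ds + R(x(T))\) subject to \(\dot x = f(s,x,u)\), call \(\varphi\) admissible if

\[
\varphi_t(t,x) + \max_u\bigl[\varphi_x(t,x)\,f(t,x,u) + r(t,x,u)\bigr] \le 0,
\qquad \varphi(T,x) \ge R(x).
\]

Integrating the first inequality along any admissible trajectory gives \(\varphi(t,x) \ge J\), hence \(\varphi \ge V\); the Bellman function \(V\) satisfies both relations with equality, so \(V\) is the pointwise lower envelope of the admissible functions.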

1 citation


Journal ArticleDOI
TL;DR: In this article, the authors analyzed Bellman's equation by taking derivatives of all orders and found that the third and higher derivatives of the optimal value function satisfy linear differential equations along the optimal trajectory.
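A one-dimensional sketch of the mechanism (assumed notation, not necessarily the paper's): write the Hamilton-Jacobi-Bellman equation as \(V_t + H(x,V_x) = 0\) with \(H(x,p) = \min_u\bigl[f^0(x,u) + p\,f(x,u)\bigr]\), and set \(v_k(t) = \partial_x^k V(t,x^*(t))\) along the optimal trajectory \(x^*(t)\). Then

\[
\frac{dv_k}{dt} \;=\; \partial_t\partial_x^{k} V + \partial_x^{k+1} V\,\dot x^{*},
\qquad \dot x^{*} = H_p\bigl(x^{*},V_x\bigr),
\]

so differentiating the identity \(V_t + H(x,V_x)=0\) \(k\) times in \(x\) and evaluating along \(x^{*}(t)\) cancels the \(\partial_x^{k+1}V\) terms. For \(k=1\) this gives the adjoint (costate) equation, for \(k=2\) a Riccati-type equation quadratic in \(v_2\), and for \(k\ge 3\) an equation linear in \(v_k\) whose coefficients involve only lower-order derivatives, consistent with the result stated above.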