
Showing papers on "Bellman equation published in 1974"


01 Jan 1974
TL;DR: In this paper, a Banach space with a weighted supremum norm is introduced to guarantee convergence of successive approximations to the value function; the conditions obtained are weaker than those required by the usual sup-norm approach.
Abstract: Markovian decision processes are considered in the setting of discrete time, countable state space, and general decision space. By introducing a Banach space with a weighted supremum norm, conditions are derived which guarantee convergence of successive approximations to the value function. These conditions are weaker than those required by the usual sup-norm approach. Several properties of the successive approximations are derived.
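The successive-approximation scheme can be illustrated with a minimal sketch. Everything below (the two-state MDP, the rewards, and the weight function w) is an invented toy example, not the paper's model; with a fixed discount the Bellman operator is already a sup-norm contraction, so the weighted norm here only changes how convergence is measured:

```python
import numpy as np

# Toy 2-state, 2-action MDP (illustrative, not from the paper).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # P[a, s, s'] transition probabilities
              [[0.5, 0.5], [0.6, 0.4]]])
r = np.array([[1.0, 0.0], [0.5, 2.0]])    # r[a, s] one-step rewards
beta = 0.95                                # discount factor
w = np.array([1.0, 2.0])                   # weight function w(s) > 0

def weighted_sup_norm(v, w):
    """||v||_w = max_s |v(s)| / w(s)."""
    return np.max(np.abs(v) / w)

v = np.zeros(2)
for _ in range(1000):
    # Bellman operator: (Tv)(s) = max_a [ r(a,s) + beta * sum_s' P(a,s,s') v(s') ]
    Tv = np.max(r + beta * (P @ v), axis=0)
    if weighted_sup_norm(Tv - v, w) < 1e-10:
        v = Tv
        break
    v = Tv
```

The iteration stops once successive approximations agree in the weighted norm; `v` is then (numerically) the fixed point of the Bellman operator.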

95 citations


Journal ArticleDOI
TL;DR: In this article, two-person zero-sum differential games of survival are considered; these terminate as soon as the trajectory enters a given closed set F, at which time a cost or payoff is computed.
Abstract: Two-person zero-sum differential games of survival are considered; these terminate as soon as the trajectory enters a given closed set F, at which time a cost or payoff is computed. One controller, or player, chooses his control values to make the payoff as large as possible; the other player chooses his controls to make the payoff as small as possible. A strategy is a function telling a player how to choose his control variable, and values of the game are introduced in connection with there being a delay before a player adopts a strategy. It is shown that various values of the differential game satisfy dynamic programming identities or inequalities, and these results enable one to show that if the value functions are continuous on the boundary of F then they are continuous everywhere. To discuss continuity of the values on the boundary of F, certain comparison theorems for the values of the game are established. In particular, if there are sub- and super-solutions of a related Isaacs-Bellman equation, then these provide upper and lower bounds for the appropriate value function. Thus in discussing value functions of a game of survival one is studying solutions of a Cauchy problem for the Isaacs-Bellman equation, and there are interesting analogies with certain techniques of classical potential theory.
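For orientation, one standard form of the Isaacs-Bellman equation referred to above can be sketched as follows; the dynamics f, running cost h, and boundary cost g are assumed notation, not taken from the paper:

```latex
% Sketched Isaacs--Bellman equation for a survival game with dynamics
% \dot x = f(x,u,v), running cost h, and terminal cost g on \partial F:
\max_{u}\,\min_{v}\;\bigl[\,\nabla V(x)\cdot f(x,u,v) + h(x,u,v)\,\bigr] = 0
  \quad \text{outside } F,
\qquad
V = g \ \text{on } \partial F .
```

A sub-solution (resp. super-solution) satisfies this relation with "=" replaced by "≤" (resp. "≥"), which is how the comparison bounds in the abstract arise.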

48 citations


Journal ArticleDOI
TL;DR: In this paper, the joint plant and measurement control problem of linear, unknown, discrete-time systems excited by white Gaussian noise is considered; the performance criterion is quadratic in the state and additive in the plant and measurement control.
Abstract: In this paper, the joint plant and measurement control problem of linear, unknown, discrete-time systems excited by white Gaussian noise is considered. The performance criterion is quadratic in the state and is additive in the plant and measurement control. The adaptive control solution is obtained by approximating the dynamic programming equation; the approximation amounts to replacing the optimal adaptive cost-to-go in the dynamic programming equation by the average value of the truly optimum cost-to-go for each admissible model. In our solution, the adaptive plant and measurement control schemes can be separated. The adaptive plant control is given by the product of the weighted integrals with the a posteriori probability of the parameter as weights. The adaptive measurement control scheme is obtained as the solution of a constrained nonlinear optimization problem for each time, the constraint equations being the error covariance matrix equations in the Kalman filter. An illustrative example of the optimum timing of measurements is discussed, where the joint adaptive control scheme is simulated and its performance is compared with the optimum value of the performance if the system parameters were completely known.
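The posterior-weighted structure of the adaptive plant control can be sketched for a finite set of candidate models; the scalar system, parameter values, and posterior below are illustrative assumptions, with the paper's weighted integrals replaced by a finite sum:

```python
import numpy as np

# Sketch: adaptive control for scalar models x_{k+1} = a_i x_k + b u_k + noise,
# where a_i is unknown. The candidate set and posterior are made-up numbers.
candidates = [0.8, 1.2]           # possible plant parameters a_i
posterior = np.array([0.3, 0.7])  # a posteriori probabilities after filtering

def lq_gain(a, b=1.0, q=1.0, r=1.0, iters=200):
    """Steady-state LQ gain for a scalar system via Riccati iteration."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * p * b) ** 2 / (r + b * b * p)
    return a * p * b / (r + b * b * p)

# Adaptive plant control: posterior-weighted average of the per-model
# optimal controls (a finite-sum analog of the paper's weighted integrals).
x = 2.0
u = -sum(w * lq_gain(a) * x for w, a in zip(posterior, candidates))
```

As the posterior concentrates on the true parameter, `u` approaches the control that would be optimal if the parameter were known.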

12 citations


Journal ArticleDOI
TL;DR: In this article, the authors apply the concept of dynamic programming to derive the eikonal equation from Fermat's principle of least time for anisotropic media, which is a natural extension of his treatment.
Abstract: In this note, we apply the concept of dynamic programming to derive the eikonal equation from Fermat’s principle of least time for anisotropic media. The derivation for isotropic media was given by Kalaba and the result of the present paper is a natural extension of his treatment. The key to the derivation is Bellman’s principle of optimality, which is stated below. First, we derive the eikonal equation for isotropic media, for three dimensions, because Kalaba’s derivation was restricted to two dimensions. After this we establish the result for anisotropic media. We follow Kalaba’s derivation closely.
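The dynamic-programming step behind the isotropic derivation can be sketched as follows, with T the least travel time and c(x) the local speed; the notation is assumed, not Kalaba's:

```latex
% Principle of optimality over a small step \Delta s in unit direction e:
T(x) \;=\; \min_{\|e\|=1}\Bigl[\frac{\Delta s}{c(x)} + T(x + e\,\Delta s)\Bigr]
        + o(\Delta s).
% Expanding T(x + e\,\Delta s) = T(x) + \Delta s\, e\cdot\nabla T + o(\Delta s)
% and letting \Delta s \to 0 gives \min_{\|e\|=1} e\cdot\nabla T = -1/c(x),
% i.e. the eikonal equation (n = refractive index, c_0 = reference speed):
\|\nabla T(x)\| \;=\; \frac{1}{c(x)} \;=\; \frac{n(x)}{c_0}.
```

In the anisotropic case the speed also depends on the direction e, so the minimization no longer reduces to the norm of the gradient.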

5 citations


Proceedings ArticleDOI
01 Nov 1974
TL;DR: In this paper, it is shown that if a weaker form of the classical sensitivity theorem of nonlinear programming holds, then directional derivatives and in some cases even ordinary derivatives of the supremal value function exist.
Abstract: This paper solves a particular min-max problem where the minimizing variables not only influence the performance index but also constrain the domain of action of the maximizing variables. The problem is approached as minimization of a supremal value function. It is shown here that if a weaker form of the classical sensitivity theorem of non-linear programming holds, then directional derivatives, and in some cases even ordinary derivatives, of the supremal value function exist. The existence of ordinary derivatives is especially useful for computation purposes, in that case the min-max becomes a stationary point under equality constraints with respect to both players separately. The problem originates from a new approach for control design of a dynamic uncertain system when a bound in norm of the approximation error is guaranteed.
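A Danskin-type result of the kind described can be checked numerically on a toy supremal value function; the payoff f below is invented, and the grid maximization is a crude stand-in for the inner maximization:

```python
import numpy as np

# When the maximizer y*(x) is unique, V(x) = max_y f(x, y) is differentiable
# with V'(x) = (partial f / partial x)(x, y*(x))  -- a Danskin-type result.
def f(x, y):
    return -(y - x) ** 2 + x * y   # toy payoff, smooth in both arguments

ys = np.linspace(-5, 5, 100001)    # grid for the inner maximization

def V(x):
    return np.max(f(x, ys))

x0 = 1.0
# Analytic maximizer: df/dy = -2(y - x) + x = 0  =>  y* = 3x/2.
y_star = ys[np.argmax(f(x0, ys))]
# Partial derivative of f in x at (x0, y*): df/dx = 2(y - x) + y.
danskin = 2 * (y_star - x0) + y_star
# Compare with a central finite difference of the supremal value function.
fd = (V(x0 + 1e-4) - V(x0 - 1e-4)) / 2e-4
```

Here V(x) = 5x²/4, so V'(1) = 2.5, matching both the Danskin formula and the finite difference; when the maximizer is not unique, only directional derivatives survive, as the abstract indicates.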

5 citations


Book ChapterDOI
01 Jul 1974
TL;DR: In this paper, the formal procedure for obtaining the Bellman equation in the problem of heat-conduction control is described.
Abstract: The formal procedure for obtaining the Bellman equation in the problem of heat-conduction control is stated in [1].

1 citation


Book ChapterDOI
TL;DR: This chapter presents a conversion of the game situation to a mathematical programming problem and demonstrates that, even for simple situations, exact analytical solutions are difficult or impossible to obtain.
Abstract: Publisher Summary The advent of computers has allowed the building of experience through simulation of the competition and has opened the possibility of evaluating conflicts numerically. The latter possibility brings the logic of mathematics to bear on important events that were previously evaluated only intuitively. A branch of mathematics, called game theory, is concerned with finding solutions to mathematical models of conflict, competition, and cooperation. Approximate numerical methods are the most reasonable approach to use for real game situations. This chapter presents a conversion of the game situation to a mathematical programming problem and demonstrates that, even for simple situations, exact analytical solutions are difficult or impossible to obtain. The principle of optimality and the method of dual cones are brought together and extended so that it is possible to synthesize approximate numerical solutions to an important class of dynamic games. The dynamic games are viewed as a sequence of parametrized static games, each of which can be solved as a parametrized mathematical programming problem.
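The simplest instance of the game-to-program conversion described here is a zero-sum matrix game solved as a linear program; the example below (rock-paper-scissors, solved with SciPy's `linprog`) is an illustrative sketch, not the chapter's dual-cone method:

```python
import numpy as np
from scipy.optimize import linprog

# Payoff to the row player in rock-paper-scissors (value of this game is 0).
A = np.array([[ 0.0, -1.0,  1.0],
              [ 1.0,  0.0, -1.0],
              [-1.0,  1.0,  0.0]])

# Standard LP reduction: shift payoffs so the value is positive, then solve
#   min sum(x)  s.t.  A_shift^T x >= 1,  x >= 0;
# the game value is 1/sum(x) (minus the shift), strategy is x normalized.
shift = 2.0
As = A + shift
res = linprog(c=np.ones(3),
              A_ub=-As.T, b_ub=-np.ones(3),   # encodes As^T x >= 1
              bounds=[(0, None)] * 3)
value = 1.0 / res.x.sum() - shift             # game value
strategy = res.x / res.x.sum()                # optimal mixed strategy
```

For rock-paper-scissors this recovers the value 0 and the uniform strategy (1/3, 1/3, 1/3); in the chapter's setting, each static game in the sequence would be solved by a program of this kind, parametrized by the game state.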

Book ChapterDOI
01 Jul 1974
TL;DR: In this article, the synthesis of optimal control with a quadratic optimality criterion is solved with the help of the Bellman equation; the solution leads to a nonlinear boundary-value problem for an integro-differential equation.
Abstract: The formal procedure for obtaining the Bellman equation in the problem of heat-conduction control is stated in [1]. Here we show how the problem of synthesis of optimal control with a quadratic optimality criterion is solved with the help of this equation. To simplify the formulas we use the simplest example, which can be easily generalized. It should be noted that in the course of the solution a nonlinear boundary-value problem for an integro-differential equation is derived, which is an infinite-dimensional analog of the well-known Riccati equation for finite-dimensional systems.
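For comparison, the finite-dimensional matrix Riccati equation that the abstract's infinite-dimensional analog generalizes can be written, in assumed standard LQ notation, as:

```latex
% Matrix Riccati equation for the LQ problem \dot x = Ax + Bu with cost
% \int (x^{\top} Q x + u^{\top} R u)\,dt  (notation assumed):
-\dot P(t) = A^{\top}P(t) + P(t)A - P(t)BR^{-1}B^{\top}P(t) + Q,
\qquad
u^{*}(t) = -R^{-1}B^{\top}P(t)\,x(t).
```

In the heat-conduction setting, the matrices become operators (integral kernels), which is why the boundary-value problem in the abstract is integro-differential.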