
Showing papers on "Bellman equation published in 1977"


Journal ArticleDOI
TL;DR: An abstract model is proposed for the problem of optimal control of systems subject to random perturbations, for which the principle of optimality takes on an appealing form; when the model is specialized to jump processes, the additional structure permits operationally useful optimality conditions.
Abstract: The paper proposes an abstract model for the problem of optimal control of systems subject to random perturbations, for which the principle of optimality takes on an appealing form. This model is specialized to the case where the state of the controlled system is realized as a jump process. The additional structure permits operationally useful optimality conditions. Some illustrative examples are solved.
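
For orientation, a hedged sketch of a standard Bellman equation of this type (the jump rate λ, post-jump kernel Q, running cost c, and discount rate α are generic names, not the paper's notation): for a controlled jump process, the principle of optimality leads to

$$
\alpha V(x) \;=\; \inf_{u \in U} \Big[\, c(x,u) + \lambda(x,u) \int \big( V(y) - V(x) \big)\, Q(dy \mid x, u) \Big].
$$

An optimal control can then be sought by minimizing the bracketed expression pointwise in the state x.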

117 citations


Journal ArticleDOI
TL;DR: In this paper, a Banach space with a weighted supremum norm is introduced to guarantee convergence of successive approximations to the value function under conditions weaker than those required by the usual sup-norm approach.
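
A minimal sketch of value iteration monitored in a weighted sup norm, assuming a finite MDP with transition tensor P[a, x, y], reward r[a, x], discount beta, and weight vector w (all names are illustrative; the paper's setting is more general):

```python
import numpy as np

def weighted_sup_norm(v, w):
    """||v||_w = max_x |v[x]| / w[x]."""
    return np.max(np.abs(v) / w)

def value_iteration(P, r, beta, w, tol=1e-8, max_iter=10_000):
    """Successive approximation v <- T v for a finite MDP.
    Convergence is measured in the w-weighted sup norm, which can
    accommodate rewards that are unbounded but dominated by w."""
    n = P.shape[1]
    v = np.zeros(n)
    for _ in range(max_iter):
        # Bellman operator: (T v)(x) = max_a [ r(a,x) + beta * sum_y P(a,x,y) v(y) ]
        q = r + beta * P @ v          # shape (n_actions, n)
        v_new = q.max(axis=0)
        if weighted_sup_norm(v_new - v, w) < tol:
            return v_new
        v = v_new
    return v
```

The point of the weighted norm is that the Bellman operator can be a contraction with respect to ||·||_w even when it is not one in the ordinary sup norm.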

87 citations


Journal ArticleDOI
TL;DR: In this paper, two methods are proposed for successive approximation of the value function of a pursuit game with limited time and payoff function min_{τ∈[0,t]} H(x(τ), y(τ)); the approximations are used directly for the constructive design of successive pursuit and evasion strategies, permitting ε-optimal strategies to be found for any ε > 0.
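
As a hedged, discrete-time sketch of what successive approximation can look like for such a game (the step size Δ and the minimax ordering are assumptions, not the authors' construction), the value W of the game from time t satisfies approximately

$$
W(t, x, y) \approx \min\Big\{ H(x,y),\; \min_{u}\max_{v}\, W\big(t+\Delta,\, x_u(t+\Delta),\, y_v(t+\Delta)\big) \Big\}, \qquad W(T, x, y) = H(x,y),
$$

and iterating this backward recursion from the terminal time yields both approximate values and the strategies realizing them.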

24 citations


Proceedings ArticleDOI
04 May 1977
TL;DR: It is shown that nonserial dynamic programming is optimal among a class of algorithms for an important class of discrete optimization problems; the result has strong implications for the choice of deterministic, adaptive, and nondeterministic algorithms for such problems.
Abstract: We show that nonserial dynamic programming is optimal among one class of algorithms for an important class of discrete optimization problems. We consider discrete, multivariate, optimization problems in which the objective function is given as a sum of terms. Each term is a function of only a subset of the variables. We first consider a class of optimization algorithms which eliminate suboptimal solutions by comparing the objective function on “comparable” partial solutions. A large, natural subclass of comparison algorithms in which the subproblems considered are either nested or nonadjacent (i.e., noninteracting) is then defined. It is shown that a variable-elimination procedure, nonserial dynamic programming, is optimal in an extremely strong sense among all algorithms in the subclass. The results' strong implications for choosing deterministic, adaptive, and nondeterministic algorithms for the optimization problem, for defining a complexity measure for a pattern of interactions, and for describing general classes of decomposition procedures are discussed. Several possible extensions and unsolved problems are mentioned.
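
A hedged Python sketch of the variable-elimination idea behind nonserial dynamic programming (the term representation and elimination order are illustrative assumptions):

```python
def eliminate(terms, var, domain):
    """Eliminate `var`: combine the terms mentioning it, minimize over
    its values, and return the term list with one new reduced term.
    Each term is (scope, fn), with scope a tuple of variable names and
    fn mapping an assignment dict to a number."""
    touching = [t for t in terms if var in t[0]]
    rest = [t for t in terms if var not in t[0]]
    new_scope = tuple(sorted({v for s, _ in touching for v in s} - {var}))

    def reduced(assign):
        return min(
            sum(fn({**assign, var: val}) for _, fn in touching)
            for val in domain[var]
        )

    return rest + [(new_scope, reduced)]

def nonserial_dp(terms, order, domain):
    """Minimize a sum of local terms by eliminating variables in `order`."""
    for var in order:
        terms = eliminate(terms, var, domain)
    return sum(fn({}) for _, fn in terms)   # all scopes now empty

# Example: minimize (x1 - x2)^2 + (x2 + x3) over {0,1}^3  ->  0
dom = {"x1": [0, 1], "x2": [0, 1], "x3": [0, 1]}
terms = [
    (("x1", "x2"), lambda a: (a["x1"] - a["x2"]) ** 2),
    (("x2", "x3"), lambda a: a["x2"] + a["x3"]),
]
print(nonserial_dp(terms, ["x1", "x2", "x3"], dom))
```

The cost of each elimination is exponential only in the size of the combined scope, which is what makes the procedure sensitive to the pattern of interactions among variables.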

11 citations


Journal ArticleDOI
Robert Janin
TL;DR: In this article, the authors consider a mathematical programming problem that depends on certain parameters and study the sensitivity of the optimal value as those parameters vary; the mapping from parameters to optimal value is called the value function.
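
Concretely, in standard parametric-programming notation (assumed here, not taken from the paper), the value function is

$$
v(p) \;=\; \inf \big\{\, f(x, p) \;:\; g_i(x, p) \le 0,\ i = 1, \dots, m \,\big\},
$$

and the sensitivity question is how v behaves (continuity, directional derivatives, subgradients) as the parameter p varies.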

8 citations


Journal ArticleDOI
TL;DR: Optimal immunization policies together with corresponding optimal costs are presented for the Greenwood and Reed-Frost chain-binomial epidemic models.
Abstract: A policy of immunization is considered for the Greenwood and Reed-Frost chain-binomial epidemic models. In the case of the Reed-Frost model the assumption is made that the probability of contact between two individuals is either small or large. The principle of optimality is used to derive dynamic programming equations for both deterministic and stochastic versions of these models. Optimal immunization policies together with corresponding optimal costs are presented.
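
A hedged Python sketch of such a dynamic programming recursion for a Greenwood-type model (the cost parameters P_CONTACT, COST_IMM, COST_INF, the stopping rule, and the recursion over the number of susceptibles are illustrative assumptions, not the paper's model):

```python
from functools import lru_cache
from math import comb

P_CONTACT = 0.3     # per-generation infection probability (assumed)
COST_IMM = 1.0      # cost of immunizing one susceptible (assumed)
COST_INF = 5.0      # cost of one new infection (assumed)

@lru_cache(maxsize=None)
def min_cost(s):
    """Minimal expected future cost with s susceptibles while infection
    is present.  In the Greenwood model each susceptible escapes
    infection in a generation independently with probability 1 - P_CONTACT."""
    if s == 0:
        return 0.0
    best = float("inf")
    for a in range(s + 1):                 # immunize a of the s susceptibles
        remaining = s - a
        exp_future = 0.0
        # New infections ~ Binomial(remaining, P_CONTACT); in this
        # simplified sketch the epidemic stops when nobody is newly infected.
        for k in range(remaining + 1):
            prob = (comb(remaining, k) * P_CONTACT**k
                    * (1 - P_CONTACT)**(remaining - k))
            exp_future += prob * (k * COST_INF
                                  + (min_cost(remaining - k) if k > 0 else 0.0))
        best = min(best, a * COST_IMM + exp_future)
    return best

print(min_cost(10))   # optimal expected cost starting from 10 susceptibles
```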

6 citations


Journal ArticleDOI
K. Vit
TL;DR: Two computational algorithms are described: one uses only integration of a system of differential equations with specified initial conditions together with numerical minimization in finite-dimensional space; the other is based on the differential dynamic programming approach.
Abstract: The dynamic programming formulation of the forward principle of optimality in the solution of optimal control problems results in a partial differential equation with initial boundary condition whose solution is independent of terminal cost and terminal constraints. Based on this property, two computational algorithms are described. The first-order algorithm with minimum computer storage requirements uses only integration of a system of differential equations with specified initial conditions and numerical minimization in finite-dimensional space. The second-order algorithm is based on the differential dynamic programming approach. Either of the two algorithms may be used for problems with nondifferentiable terminal cost or terminal constraints, and the solution of problems with complicated terminal conditions (e.g., with free terminal time) is greatly simplified.
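
A hedged sketch of the shape of that forward equation, assuming dynamics ẋ = f(x, u), running cost L(x, u), and a cost-to-come function V(t, x) accumulated from a fixed initial state x₀:

$$
\frac{\partial V}{\partial t}(t, x) \;=\; \min_{u} \big[\, L(x, u) - \nabla_x V(t, x) \cdot f(x, u) \,\big], \qquad V(0, x_0) = 0.
$$

Because the boundary condition sits at the initial time, the solution carries no information about terminal cost or terminal constraints; those enter only afterward, when V is minimized over admissible terminal states and times.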

6 citations


Journal ArticleDOI
TL;DR: In this paper, the authors consider a stationary dynamic program with general state and action spaces and with an unbounded reward function and derive necessary and sufficient conditions for the validity of Howard's policy improvement method.
Abstract: We consider a stationary dynamic program with general state and action spaces and with an unbounded reward function. Taking a martingale approach to the optimization problem, we derive several necessary and sufficient conditions for the validity of Howard's policy improvement method. The conditions hold in both the positive and the negative case. By means of these results we can construct a sequence of stationary policies for which the expected rewards converge to the value function. The construction is a straightforward generalization of the method given by Frid [3].
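
A minimal sketch of Howard's policy improvement for a finite discounted MDP, assuming a transition tensor P[a, x, y] and reward matrix r[a, x] (the paper's setting is far more general: arbitrary state and action spaces and unbounded rewards):

```python
import numpy as np

def policy_iteration(P, r, beta, max_iter=1000):
    n_actions, n_states, _ = P.shape
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iter):
        # Policy evaluation: solve (I - beta * P_pi) v = r_pi exactly.
        P_pi = P[policy, np.arange(n_states)]       # (n_states, n_states)
        r_pi = r[policy, np.arange(n_states)]
        v = np.linalg.solve(np.eye(n_states) - beta * P_pi, r_pi)
        # Policy improvement: act greedily with respect to v.
        q = r + beta * P @ v                        # (n_actions, n_states)
        new_policy = q.argmax(axis=0)
        if np.array_equal(new_policy, policy):
            return policy, v                        # stable policy: optimal
        policy = new_policy
    return policy, v
```

The conditions studied in the paper are what justify the improvement step (the greedy update can only increase expected reward) when rewards are unbounded and the state space is general.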

1 citation


01 Jan 1977
TL;DR: In this paper, a broad class of inverse theorems on mathematical programming problems is given, in which the objective function is a recursive function that is either strictly increasing or strictly decreasing, as is the constraint function.
Abstract: The author gives a broad class of inverse theorems on mathematical programming problems, where the objective function is either a recursive function with strict increasingness or a recursive function with strict decreasingness, and so is the constraint function. It is also shown that the optimal-value functions of the main and inverse problems can be expressed by the successive use of some nonlinear operators defined in this paper. Each expression is based upon either Bellman's Principle of Optimality or a modified form of it. Each inverse theorem is accompanied by an example.
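
As an assumed illustration of the main/inverse pairing (generic notation, not the paper's operators): for strictly increasing f and g, the paired problems

$$
F(c) = \max\{\, f(x) : g(x) \le c \,\}, \qquad G(v) = \min\{\, g(x) : f(x) \ge v \,\}
$$

have optimal-value functions that invert each other, G(F(c)) = c, which is the sense in which the second problem is the inverse of the first.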

Journal ArticleDOI
S.P. Bansal
TL;DR: In this paper, a routing situation is considered in which the constraint requires that no two consecutive arcs of the path be traversed by the same agency; functional equations based on Bellman's principle of optimality are developed and solved.
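
A hedged Python sketch of a recursion of this kind, where the dynamic programming state is expanded to (node, agency of the last arc); the graph encoding and all names are assumptions for illustration, and arc costs are assumed nonnegative:

```python
from heapq import heappush, heappop
from itertools import count

def constrained_shortest_path(arcs, source, target):
    """arcs: iterable of (u, v, agency, cost) tuples.  A path may use an
    arc out of u only if its agency differs from the agency of the arc
    that entered u.  Dijkstra over the expanded states (node, agency)."""
    out = {}
    for u, v, agency, cost in arcs:
        out.setdefault(u, []).append((v, agency, cost))

    tie = count()                       # tiebreaker: heap never compares agencies
    settled = set()
    heap = [(0.0, next(tie), source, None)]   # None = no arc traversed yet
    while heap:
        d, _, node, last = heappop(heap)
        if node == target:
            return d
        if (node, last) in settled:
            continue
        settled.add((node, last))
        for v, agency, cost in out.get(node, []):
            if agency == last:          # same agency on consecutive arcs: forbidden
                continue
            heappush(heap, (d + cost, next(tie), v, agency))
    return float("inf")                 # target unreachable under the constraint
```

Carrying the last agency in the state is what turns the constrained problem back into an ordinary shortest-route recursion of the Bellman type.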