
Showing papers on "Bellman equation" published in 1976


Journal ArticleDOI
TL;DR: In this article, the authors give rigorous proofs for the justification of optimality equations and for the effectiveness of the principle with two meanings without assuming the existence of maximum values of returns.
Abstract: Dynamic programming (DP) was introduced by R. Bellman [2] as an important technique for solving non-linear programming problems in which a sequence of decisions has to be chosen in an optimal manner. Bellman, in his book, proposed the "Principle of Optimality" to show that the determination of an optimal policy can be reduced to the solution of an optimality equation, i.e., a functional equation that should be satisfied by an optimal return. Although the Principle of Optimality is a proposition that needs mathematical reasoning, his justification for the principle was not in a precise mathematical form. For this reason, the scope of cost structures to which the principle is applicable has been left unexplained. Afterward, G. L. Nemhauser [9] gave a sufficient condition on the cost structure for an optimality equation to hold true. His condition is that the cost function should have both a separability property and a monotonicity property. Nemhauser did not make explicit the relation between the effectiveness of Bellman's principle and the justification for an optimality equation; the relation is no longer trivial under his condition. In this paper we shall be concerned with the optimization of finite-stage sequential decision processes. We shall give rigorous proofs for the justification of optimality equations and for the effectiveness of optimality principles with two meanings, without assuming the existence of maximum values of returns. Our condition is that the cost function should have a recursiveness property, a monotonicity property and a Lipschitz condition. Our recursiveness is essentially the same as separability in Nemhauser's sense. Our monotonicity has two senses: one is a wide sense, and the other a strict sense. The monotonicity properties in the wide and the strict senses, together with the recursiveness and the Lipschitz condition, induce optimality principles in a weak and a strong sense, respectively.
Bellman's Principle of Optimality is properly identified, in our terms, as a principle in the strong sense. Our principle in the weak sense has not been introduced elsewhere in the literature as far as the authors know. If we assume the existence of maximum values of returns as Nemhauser did, then the Lipschitz condition can be dropped from the hypotheses in our arguments. In this paper we treat both deterministic and stochastic cases. Section 2 is

21 citations
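
As an illustration of the kind of optimality equation discussed in the abstract above, the following is a minimal sketch of backward induction for a finite-stage deterministic decision process. It is not the paper's construction: the state set, the actions, the transition `step`, the `reward` function and the zero terminal value are all hypothetical choices made for this example. The return is additive, which is one simple case of a recursive and monotone cost, and every set is finite, so maxima exist (the paper explicitly avoids that assumption).

```python
def backward_induction(states, actions, step, reward, horizon):
    """Solve V_t(s) = max_a [ reward(s, a) + V_{t+1}(step(s, a)) ] backwards in time."""
    value = {s: 0.0 for s in states}  # assumed terminal values: V_T(s) = 0
    policy = []                       # policy[t][s] = action chosen at stage t in state s
    for _ in range(horizon):
        new_value, decision = {}, {}
        for s in states:
            # Pick the action that maximizes immediate reward plus value-to-go.
            best = max(actions, key=lambda a: reward(s, a) + value[step(s, a)])
            decision[s] = best
            new_value[s] = reward(s, best) + value[step(s, best)]
        value = new_value
        policy.insert(0, decision)    # earlier stages are computed later
    return value, policy


# Hypothetical toy problem: walk on {0, 1, 2, 3}; the reward is the state reached.
states = [0, 1, 2, 3]
actions = [0, 1]                      # 0 = stay, 1 = move right

def step(s, a):
    return min(s + a, 3)

def reward(s, a):
    return step(s, a)

value, policy = backward_induction(states, actions, step, reward, horizon=3)
print(value[0], policy[0][0])
```

Starting from state 0 with three stages, moving right at every stage collects rewards 1, 2 and 3, which is what the recursion selects.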


Book ChapterDOI
Makiko Nisio1
01 Jan 1976

17 citations


Journal ArticleDOI
TL;DR: In this article, a new concept of optimality in zero-sum, two-player differential games is presented and studied, which consists of a local semipermeable condition and a global reprisal condition.
Abstract: A new concept of optimality in zero-sum, two-player differential games is presented and studied. This concept consists of a 'local semipermeable condition' and a 'global reprisal condition'. The latter is introduced to overcome certain difficulties which arise in differential games in which there exist strategy pairs which yield paths which do not meet the terminal surface (or set). It is shown that when specified conditions are satisfied, distinct optimal strategies yield the same value function, optimal strategies are interchangeable, and the resulting value function satisfies Isaacs' equation. Furthermore, in games of finite duration or other 'all-terminating' games, this concept reduces to the classical saddle-point formulation.

4 citations


20 Apr 1976
TL;DR: In this paper, the first derivatives of a Kuhn-Tucker triple are derived for a nonlinear programming problem with perturbations in the right-hand side of the constraints.
Abstract: The paper first presents a brief historical survey of the introduction of Lagrange multipliers in characterizing optimality and duality in mathematical programming. Attention is focused on the interpretation of optimal Lagrange multipliers as a first-order measure of the sensitivity of the optimal value function of the problem with right-hand side perturbations of the constraints. For the latter problem, explicit formulas are then obtained for calculating the first derivatives of a Kuhn-Tucker triple, resulting in second-order characterizations of the optimal value function. Approximation formulas are developed for the algorithm based on the logarithmic-quadratic penalty function. Applications are indicated, e.g., in obtaining sharper estimates of the optimal value of a problem with different constraint right-hand sides, in applying a well known approach to solving a class of large-scale decomposable nonlinear programming problems, and in supplementing the rich theoretical developments associated with a first-order analysis of the optimal value function of the problem with perturbations in the right-hand sides of the constraints. (Author)

2 citations
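
The first-order interpretation of a Lagrange multiplier mentioned in the abstract above can be checked on a small worked example. The sketch below is not taken from the paper: the problem (minimize x^2 + y^2 subject to x + y = b) and all function names are hypothetical, chosen because the solution is available in closed form. The optimal value is v(b) = b^2 / 2 and the multiplier is lambda(b) = b, so the multiplier should match the derivative of v with respect to the right-hand side b.

```python
def optimal_value(b):
    """Optimal value of: minimize x^2 + y^2 subject to x + y = b."""
    x = y = b / 2.0                  # by symmetry the minimizer is x = y = b/2
    return x * x + y * y

def multiplier(b):
    """Multiplier from stationarity of L = x^2 + y^2 - lam * (x + y - b)."""
    return b                         # 2x = lam with x = b/2 gives lam = b

def central_difference(f, b, h=1e-5):
    """Numerical derivative of f at b (assumed step size h)."""
    return (f(b + h) - f(b - h)) / (2.0 * h)

b0 = 3.0
slope = central_difference(optimal_value, b0)
print(slope, multiplier(b0))         # the two numbers should agree closely
```

In this toy case the derivative of the multiplier itself, d(lambda)/db = 1, equals the second derivative of v, which is the flavor of second-order information the abstract refers to.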


Journal ArticleDOI
TL;DR: In this paper, the problem of maximizing the probability of a controlled system attaining the specified state is considered, and the related Bellman equations and the methods for determining the optimal control are investigated.

1 citation
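
To make the idea of a Bellman equation for a reachability probability concrete, here is a minimal sketch on a finite controlled Markov chain. The listing gives no details of the paper's model, so everything below is an assumption for illustration: the four states, the two actions `step` and `jump`, the transition probabilities, and the use of plain value iteration. The equation being iterated is V(s) = max over a of the sum of P(s' | s, a) V(s'), with V = 1 at the target and V = 0 at the failure state.

```python
TARGET, FAIL = 3, 0

# transitions[state][action] = list of (probability, next_state); hypothetical numbers.
transitions = {
    1: {"step": [(0.5, 2), (0.5, 0)], "jump": [(0.4, TARGET), (0.6, FAIL)]},
    2: {"step": [(0.5, 3), (0.5, 1)], "jump": [(0.4, TARGET), (0.6, FAIL)]},
}

def expected(value, outcomes):
    """Expected value of the next state under one action."""
    return sum(p * value[n] for p, n in outcomes)

def reach_probability(transitions, target, fail, sweeps=100):
    """Value iteration for the maximal probability of reaching `target`."""
    value = {s: 0.0 for s in transitions}
    value[target], value[fail] = 1.0, 0.0      # absorbing boundary values
    for _ in range(sweeps):
        updated = dict(value)
        for s, acts in transitions.items():
            updated[s] = max(expected(value, o) for o in acts.values())
        value = updated
    # Greedy policy with respect to the converged values.
    policy = {
        s: max(acts, key=lambda a: expected(value, acts[a]))
        for s, acts in transitions.items()
    }
    return value, policy

value, policy = reach_probability(transitions, TARGET, FAIL)
print(value[1], value[2], policy)
```

In this toy chain the risky `jump` is preferable far from the target (state 1), while the cautious `step` is preferable next to it (state 2).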


Journal Article
TL;DR: In this paper, an optimal control is proposed for regulating power distribution in a nuclear power reactor with cylindrical geometry; the space dependence of the system is described by expanding the space-dependent variables in Helmholtz modes.
Abstract: An optimal control is given for regulating power distribution in a nuclear power reactor which has cylindrical geometry. The space dependence of the system is described by expanding space-dependent variables in Helmholtz modes. Results are obtained through the principle of optimality and are described by the Riccati-type algebraic equation that the optimal feedback coefficients should satisfy. Use of an integral equation as the system equation makes it possible to deal with actual controlling apparatuses: control rods or rod clusters.
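
For readers unfamiliar with Riccati-type algebraic equations for feedback coefficients, the following is a minimal scalar sketch. It is not the reactor model of the abstract above, which involves an integral-equation system and several spatial modes. It assumes a hypothetical single-mode linear system dx/dt = a x + b u with quadratic cost, the integral of q x^2 + r u^2, for which the algebraic Riccati equation is 2 a p - (b^2 / r) p^2 + q = 0 and the optimal feedback is u = -k x with k = b p / r. All symbols and numbers are illustrative.

```python
import math

def scalar_riccati(a, b, q, r):
    """Positive root p of 2*a*p - (b**2 / r) * p**2 + q = 0 and the gain k = b*p/r."""
    root = math.sqrt(a * a + b * b * q / r)
    p = (a + root) * r / (b * b)     # stabilizing (positive) solution
    k = b * p / r                    # feedback coefficient in u = -k * x
    return p, k

# Hypothetical numbers: an unstable open-loop mode (a > 0).
a, b, q, r = 1.0, 1.0, 3.0, 1.0
p, k = scalar_riccati(a, b, q, r)
residual = 2 * a * p - (b * b / r) * p * p + q   # should vanish at the solution
closed_loop = a - b * k                          # negative means the loop is stable
print(p, k, residual, closed_loop)
```

With these numbers the closed-loop coefficient a - b k is negative, so the feedback stabilizes the mode.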