
Showing papers on "Bellman equation" published in 1985


Journal ArticleDOI
TL;DR: In this paper, a linear superposition of M fixed basis functions is proposed to fit the value function of a Markovian decision process, reducing the problem dimensionality from the number of states down to M.

385 citations
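
The proposal amounts to compressing the value function: instead of one value per state, keep M basis weights. Below is a minimal sketch of one such fit, a projected value iteration on an invented discounted MDP with a polynomial basis; the data, basis choice, and iteration scheme are illustrative assumptions, not details from the paper.

```python
import numpy as np

# Minimal sketch: approximate the value function of a discounted MDP by a
# linear superposition V ≈ Phi @ w of M basis functions, fitting w by
# least-squares projection of a one-step Bellman backup. MDP data and the
# polynomial basis are invented; projected iteration like this need not
# converge in general, but does on this toy problem.

n_states, n_actions, M, gamma = 50, 2, 4, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # P[a, s, :]
R = rng.uniform(0.0, 1.0, size=(n_actions, n_states))             # R[a, s]

s = np.linspace(0.0, 1.0, n_states)
Phi = np.vander(s, M, increasing=True)          # basis 1, s, s^2, s^3

w = np.zeros(M)
for _ in range(200):
    V = Phi @ w                                 # current compressed values
    T_V = (R + gamma * P @ V).max(axis=0)       # Bellman backup at all states
    w, *_ = np.linalg.lstsq(Phi, T_V, rcond=None)  # project onto span(Phi)

print("fitted weights:", w)
```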


Journal ArticleDOI
TL;DR: In this paper, the time-average reward of a discrete-time controlled Markov process is maximized over the class of all causal policies, subject to a time-average cost constraint, by using a Lagrange multiplier formulation involving the dynamic programming equation.

249 citations
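
The Lagrangian device reduces the constrained problem to a family of unconstrained ones: absorb the cost into the reward as r - λc, solve, and tune λ until the constraint binds. A hedged sketch of that loop (the paper works with time-average costs; the toy below uses a discounted criterion and invented data purely to keep the code short):

```python
import numpy as np

# Hedged sketch of the Lagrange multiplier idea: fold the cost into the
# reward as r - lam*c, solve the unconstrained MDP, and bisect on lam
# until the policy's expected cost meets the budget. Data are invented.

rng = np.random.default_rng(1)
nS, nA, gamma, budget = 20, 3, 0.95, 8.0
P = rng.dirichlet(np.ones(nS), size=(nA, nS))   # P[a, s, :]
r = rng.uniform(0.0, 1.0, (nA, nS))             # reward
c = rng.uniform(0.0, 1.0, (nA, nS))             # cost

def solve(lam, iters=500):
    V = np.zeros(nS)
    for _ in range(iters):                      # value iteration on r - lam*c
        V = (r - lam * c + gamma * P @ V).max(axis=0)
    pi = (r - lam * c + gamma * P @ V).argmax(axis=0)
    Ppi, cpi = P[pi, np.arange(nS)], c[pi, np.arange(nS)]
    Vc = np.linalg.solve(np.eye(nS) - gamma * Ppi, cpi)  # policy's cost
    return pi, Vc.mean()                        # cost averaged over starts

lo, hi = 0.0, 50.0
for _ in range(40):                             # bisection on the multiplier
    lam = 0.5 * (lo + hi)
    pi, cost = solve(lam)
    if cost <= budget:
        hi = lam                # constraint slack: penalty can be relaxed
    else:
        lo = lam                # constraint violated: penalize cost more
print("lambda ≈", lam, " expected cost ≈", cost)
```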


Journal ArticleDOI
TL;DR: In this paper, it was shown that the value function is the viscosity solution of the associated Hamilton–Jacobi–Bellman and Isaacs equations, and that viscosity super- and subsolutions of these equations must satisfy inequalities called the super- and subdynamic programming principles, respectively.
Abstract: Recent work by the authors and others has demonstrated the connections between the dynamic programming approach to optimal control theory and to two-person, zero-sum differential games problems and the new notion of “Viscosity” solutions of Hamilton–Jacobi PDE’s introduced by M. G. Crandall and P.-L. Lions. In particular, it has been proved that the dynamic programming principle implies that the value function is the viscosity solution of the associated Hamilton–Jacobi–Bellman and Isaacs equations. In the present work, it is shown that viscosity super- and subsolutions of these equations must satisfy some inequalities called super- and subdynamic programming principle respectively. This is then used to prove the equivalence between the notion of viscosity solutions and the conditions, introduced by A. Subbotin, concerning the sign of certain generalized directional derivatives.

140 citations
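
For reference, the Crandall–Lions definition the abstract relies on, stated in one common sign convention:

```latex
% u is a viscosity subsolution of H(x, u, Du) = 0 on \Omega if, for every
% test function \phi \in C^1(\Omega),
\[
u - \phi \ \text{has a local maximum at}\ x_0
\;\Longrightarrow\;
H\bigl(x_0, u(x_0), D\phi(x_0)\bigr) \le 0,
\]
% and a viscosity supersolution if, at every local minimum x_0 of u - \phi,
\[
H\bigl(x_0, u(x_0), D\phi(x_0)\bigr) \ge 0.
\]
% A viscosity solution is both. The paper's result: super- and subsolutions
% obey corresponding one-sided (super/sub) dynamic programming principles.
```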


Journal ArticleDOI
D. Vermes
TL;DR: In this article, the trajectories of piecewise deterministic Markov processes are described as solutions of an ordinary (vector) differential equation with possible random jumps between the different integral curves; both the continuous deterministic motion and the random jumps of the processes are controlled in order to minimize the expected value of a performance functional consisting of continuous, jump and terminal costs.
Abstract: The trajectories of piecewise deterministic Markov processes are solutions of an ordinary (vector) differential equation with possible random jumps between the different integral curves. Both continuous deterministic motion and the random jumps of the processes are controlled in order to minimize the expected value of a performance functional consisting of continuous, jump and terminal costs. A limiting form of the Hamilton-Jacobi-Bellman partial differential equation is shown to be a necessary and sufficient optimality condition. The existence of an optimal strategy is proved and a characterization of the value function as supremum of smooth subsolutions is also given. The approach consists of embedding the original control problem tightly in a convex mathematical programming problem on the space of measures and then solving the latter by duality.

118 citations
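
The model class is easy to simulate, which makes the control setting concrete: flow along an ODE, jump at a state-dependent rate. A minimal simulator with invented flow, rate, and jump kernel (the uncontrolled case only; the paper controls both the flow and the jumps):

```python
import numpy as np

# Minimal sketch of an (uncontrolled) piecewise deterministic process:
# the state follows an ODE between jumps that arrive at a state-dependent
# rate. Flow, rate, and jump kernel below are invented for illustration.

rng = np.random.default_rng(2)

def flow(x):  return -x                          # toy drift  dx/dt = -x
def rate(x):  return 1.0 + x * x                 # jump intensity lambda(x)
def jump(x):  return x + rng.normal(scale=0.5)   # toy post-jump kernel

def simulate(x0, T, dt=1e-3):
    t, x, jumps = 0.0, x0, [(0.0, x0)]
    # Jump when the accumulated intensity crosses an Exp(1) threshold.
    threshold, hazard = rng.exponential(), 0.0
    while t < T:
        x += flow(x) * dt                        # Euler step along the flow
        hazard += rate(x) * dt
        t += dt
        if hazard >= threshold:                  # random jump to a new curve
            x = jump(x)
            threshold, hazard = rng.exponential(), 0.0
            jumps.append((t, x))
    return jumps

print(simulate(1.0, 5.0)[:5])                    # first few jump times/states
```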


Journal ArticleDOI
TL;DR: The optimal cost is obtained as the maximum element of a suitable set of subsolutions of the associated Hamilton–Jacobi equation, using an approximation method based on a particular derivative discretization scheme.
Abstract: We study deterministic optimal control problems having stopping time, continuous and impulse controls in each strategy. We obtain the optimal cost, considered as the maximum element of a suitable set of subsolutions of the associated Hamilton–Jacobi equation, using an approximation method. A particular derivative discretization scheme is employed. Convergence of approximate solutions is shown taking advantage of a discrete maximum principle which is also proved. For the numerical solutions of approximate problems we use a method of relaxation type. The algorithm is very simple; it can be run on computers with small central memory. In Part I we study the stationary case; in Part II [SIAM J. Control Optim., 23 (1985), pp. 267–285] we study the nonstationary case.

105 citations
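
The relaxation method is essentially Gauss–Seidel on the discretized dynamic programming equation. A hedged sketch on an invented 1-D minimum-time-style problem (the paper's scheme also handles stopping and impulse controls, omitted here):

```python
import numpy as np

# Hedged sketch of the relaxation method: Gauss-Seidel sweeps on a
# discretized 1-D discounted minimum-time problem (dx/dt = u with
# u in {-1, +1}, running cost 1, targets at both ends). Grid, discount,
# and problem are invented for illustration.

n, lam = 101, 0.1
h = 1.0 / (n - 1)                    # grid step; time step dt = h
V = np.full(n, 10.0)                 # initial guess
V[0] = V[-1] = 0.0                   # value is 0 on the target set

for sweep in range(1000):
    delta = 0.0
    for i in range(1, n - 1):        # Gauss-Seidel: reuse fresh values
        new = h + np.exp(-lam * h) * min(V[i - 1], V[i + 1])
        delta = max(delta, abs(new - V[i]))
        V[i] = new
    if delta < 1e-12:                # sweep until nothing changes
        break

print("sweeps:", sweep, " V(midpoint):", V[n // 2])
```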


Journal ArticleDOI
TL;DR: The paper deals with the network optimization problem of minimizing regular project cost subject to an arbitrary precedence relation on the sets of activities and to arbitrarily many resource constraints via a purely structural approach that considerably extends the disjunctive graph concept.
Abstract: The paper deals with the network optimization problem of minimizing regular project cost subject to an arbitrary precedence relation on the sets of activities and to arbitrarily many resource constraints. The treatment is done via a purely structural approach that considerably extends the disjunctive graph concept. It is based on so-called feasible posets and includes a quite deep and useful representation theorem. This theorem permits many insights concerning the analytical behaviour of the optimal value function, the description and counting of all essentially different optimization problems, the nature of Graham anomalies, connections with the on-line stochastic generalizations, and several others. In addition, it also allows the design of a quite powerful class of branch-and-bound algorithms for such problems, which is based on an iterative construction of feasible posets. Using so-called distance matrices, this approach permits the restriction of the exponential part of the algorithm to the often comparatively small set of ‘resource and cost essential’ jobs. The paper reports on computational experience with this algorithm for examples from the building industry and includes a rough comparison with the integer programming approach by Talbot and Patterson.

89 citations



Journal ArticleDOI
TL;DR: In this paper, the optimal stopping time problem for piecewise deterministic processes with deterministic dynamics between random jumps is studied, and W^{1,∞} existence results and probabilistic representations are given for the solutions of the problem in bounded domains and in R.
Abstract: This paper concerns the optimal stopping time problem for a piecewise deterministic process. The process has deterministic dynamics between random jumps. The associated dynamic programming equation is a variational inequality with integral and (first order) differential terms. Our main results are W^{1,∞} existence results and probabilistic representations for the solutions of the optimal stopping time problem in bounded domains and in R. We also generalize these results to the case when the state space is “countable folds” of Euclidean space.

44 citations


Book ChapterDOI
01 Jan 1985
TL;DR: In the analysis of parametric optimization problems, it is of great interest to explore certain stability properties of the optimal value function and the optimal set mapping (or some selection function of this mapping): continuity, smoothness, directional differentiability, Lipschitz continuity, and the like.
Abstract: In the analysis of parametric optimization problems it is of great interest to explore certain stability properties of the optimal value function and of the optimal set mapping (or some selection function of this mapping): continuity, smoothness, directional differentiability, Lipschitz continuity and the like. For a survey of this field we refer to comprehensive treatments of various aspects of such questions in the recent works of Fiacco (1983), Bank et al. (1982) and Rockafellar (1982).

41 citations


Journal ArticleDOI
TL;DR: Bellman's principle of optimality is valid with respect to maximal returns and leads to an algorithm to approximate these returns, although applying the algorithm to maximal returns requires significantly greater effort than applying it to greatest returns.
Abstract: A general sequential model is defined where returns are in a partially ordered set. A distinction is made between maximal (nondominated) returns and greatest returns. We show that Bellman's principle of optimality is valid with respect to maximal returns and it leads to an algorithm to approximate these returns. We argue that significantly greater effort is needed to apply this algorithm to maximal returns than to greatest returns.

32 citations
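
With returns in a partial order, the recursion must propagate sets of nondominated returns rather than a single number. A minimal sketch on an invented two-criteria DAG:

```python
# Hedged sketch of dynamic programming over partially ordered returns:
# with vector-valued costs there is generally no single greatest return,
# so each node carries its set of maximal (nondominated) labels. The
# two-criteria DAG below is invented for illustration.

edges = {                      # node -> [(successor, vector cost)]
    "s": [("a", (1, 4)), ("b", (3, 1))],
    "a": [("t", (2, 2))],
    "b": [("t", (1, 3))],
    "t": [],
}

def dominates(y, x):           # y dominates x (componentwise minimization)
    return all(yi <= xi for yi, xi in zip(y, x)) and y != x

def merge(labels, new):        # insert new, keeping only nondominated labels
    if any(dominates(l, new) for l in labels):
        return labels
    return [l for l in labels if not dominates(new, l)] + [new]

labels = {v: [] for v in edges}
labels["s"] = [(0, 0)]
for v in ("s", "a", "b"):                  # topological order of the DAG
    for w, cost in edges[v]:
        for l in labels[v]:
            cand = tuple(li + ci for li, ci in zip(l, cost))
            labels[w] = merge(labels[w], cand)

print(labels["t"])             # both maximal returns: [(3, 6), (4, 4)]
```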


01 Dec 1985
TL;DR: The extension allows value function separability and the principle of optimality to be captured in an influence diagram and then used in analysis; to accomplish this, the concept of a subvalue node has been developed.
Abstract: This thesis addresses the problem of extending influence diagram theory so that decision processes can be effectively modeled within this graphical modeling language. Specifically, the extension allows value function separability and the principle of optimality to be captured in an influence diagram and then used in analysis. To accomplish this, the concept of a subvalue node has been developed. The set of value-preserving operations on influence diagrams has been expanded to include operations that exploit the presence of these nodes. Also, an algorithm has been developed to solve influence diagrams with subvalue nodes. This work is important from two perspectives. From the decision analysis perspective, it allows a full and simple exploitation of all separability in the value function of a decision problem. Importantly, this means that algorithms can be designed to solve influence diagrams that automatically recognize the opportunity for applying the principle of optimality. From the decision processes perspective, influence diagrams with subvalue nodes allow efficient formulation and solution of nonstandard decision processes. They also allow the conditional independence among the variables in the problem to be exploited, which significantly reduces the data storage requirements and computational complexity of solving the problem. Finally, the influence diagram with subvalue nodes enhances understanding of many of the critical characteristics of various decision processes.

Journal ArticleDOI
TL;DR: In this article, an existence result is given for the Bellman equation related to an infinite dimensional control problem.
Abstract: We give an existence result on the Bellman equation related to an infinite dimensional control problem.

Journal ArticleDOI
Sanjo Zlobec1
TL;DR: This work studies when the output is a continuous function of the input and identifies optimal (minimal) realizations of mathematical models: states of the model having the property that every stable perturbation of the input results in a locally worse value of the optimal value function.
Abstract: Mathematical models are considered as input-output systems. The input is data (technological coefficients, available energy, prices) and the output is the feasible set, the set of optimal solutions, and the optimal value. We study when the output is a continuous function of the input and identify optimal (minimal) realizations of mathematical models. These are states of the model having the property that every stable perturbation of the input results in a locally worse (higher) value of the optimal value function. In input optimization we “optimize” the mathematical model rather than a specific mathematical program.

Journal ArticleDOI
TL;DR: In this paper, the authors considered the discounted and ergodic optimal control problems related to a one-dimensional storage process and established the existence and uniqueness of solutions to the corresponding Bellman equation and the regularity of the optimal value.
Abstract: We consider the discounted and ergodic optimal control problems related to a one-dimensional storage process. The existence and uniqueness of solutions to the corresponding Bellman equation and the regularity of the optimal value are established. Using the Bellman equation, an optimal feedback control is constructed. Finally, we show that under this optimal control the origin is reachable.


Journal ArticleDOI
TL;DR: In this paper, an interactive procedure based on Box's complex search is used to solve the vector maximization problem; the procedure has the advantage that the decision maker's underlying value function need not be explicitly specified.



Journal ArticleDOI
TL;DR: In this paper, the optimal control of stochastic dynamic models with partially observable coefficients is derived by means of the stochastic Bellman equation and applied to a linear system with random coefficients and a quadratic cost.
Abstract: The control of a linear system with random coefficients is studied here. The cost function is of a quadratic form and the random coefficients are assumed to be partially observable by the controller. By means of the stochastic Bellman equation, the optimal control of stochastic dynamic models with partially observable coefficients is derived. The optimal control is shown to be a linear function of the observable states and a nonlinear function of random parameters. The theory is applied to an optimal control design of an aircraft landing in wind gust.
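
The quadratic-cost structure is what makes the stochastic Bellman equation tractable: the value function stays quadratic and the control linear in the state. As a baseline, here is the classical deterministic-coefficient Riccati recursion that the paper's derivation extends to partially observed random coefficients; all matrices below are invented:

```python
import numpy as np

# Backbone of the quadratic value-function ansatz V_t(x) = x' K_t x that
# underlies the stochastic Bellman equation. This sketch is only the
# classical deterministic-coefficient Riccati recursion, not the paper's
# partially observed random-coefficient extension. Matrices are invented.

A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)                         # state cost x'Qx
Rm = np.array([[0.5]])                # control cost u'Ru
T = 50                                # horizon

K = Q.copy()                          # terminal condition K_T = Q
for t in reversed(range(T)):
    # Bellman backup on the ansatz:
    # K_t = Q + A'KA - A'KB (R + B'KB)^{-1} B'KA, with gain G as below.
    G = np.linalg.solve(Rm + B.T @ K @ B, B.T @ K @ A)
    K = Q + A.T @ K @ (A - B @ G)

print("feedback gain (u = -G x):", G)
```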

Book ChapterDOI
01 Jan 1985
TL;DR: In this article, the classical result that a Markov chain with finite state space admits an optimal stopping time given by the entrance time into the set where the value function coincides with the utility function is extended to the case where more than one continue action is available, which also yields a sufficient condition for the existence of a stationary optimal policy in a leavable gambling house with a compact action space.
Abstract: It is well-known that for the problem of stopping a Markov chain with finite state space there exists an optimal a.s. finite stopping time which is the entrance time into the set where the value function coincides with the utility function. In this paper, this result is extended to the case where more than one continue action is available. The result of the paper also yields a sufficient condition for the existence of a stationary optimal policy in a leavable gambling house with a compact action space.
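
The classical construction is directly computable for a finite chain: iterate V <- max(u, PV) and stop on the set {V = u}. A minimal sketch with an invented chain (a discount is added to keep the toy iteration contractive; the result quoted in the abstract does not need one):

```python
import numpy as np

# Computable form of the classical result the paper extends: iterate
# V <- max(u, P V); the optimal rule stops on {V = u}. Chain and utility
# are invented; beta < 1 keeps the toy iteration contractive.

rng = np.random.default_rng(3)
n, beta = 10, 0.95
P = rng.dirichlet(np.ones(n), size=n)   # transition matrix of the chain
u = rng.uniform(0.0, 1.0, n)            # utility collected on stopping

V = u.copy()
for _ in range(2000):                   # value iteration for stopping
    V = np.maximum(u, beta * P @ V)

stop_set = np.flatnonzero(np.isclose(V, u))
print("stop immediately in states:", stop_set)
```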

Journal ArticleDOI
TL;DR: In this article, it is shown that even if the primal problem or the dual problem has an unbounded optimal solution set, the nature of the values of f at points near a given point can be investigated.
Abstract: Often, the coefficients of a linear programming problem represent estimates of true values of data or are subject to systematic variations. In such cases, it is useful to perturb the original data and to either compute, estimate, or otherwise describe the values of the function f which gives the optimal value of the linear program for each perturbation. If the right-hand derivative of f at a chosen point exists and is calculated, then the values of f in a neighborhood of that point can be estimated. However, if the optimal solution set of either the primal problem or the dual problem is unbounded, then this derivative may not exist. In this note, we show that, frequently, even if the primal problem or the dual problem has an unbounded optimal solution set, the nature of the values of f at points near a given point can be investigated. To illustrate the potential utility of our results, their application to two types of problems is also explained.
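
A quick numerical illustration of the perturbation function f, assuming SciPy's linprog: solve an invented LP, nudge the right-hand side, and compare the finite-difference change in the optimal value with the reported dual price. The two agree here because both optimal solution sets are bounded, which is exactly the assumption whose failure the paper investigates.

```python
import numpy as np
from scipy.optimize import linprog

# Illustration of the perturbation function f for an invented LP:
# minimize c'x  s.t.  x1 + x2 >= 4,  2 x1 + x2 >= 5,  x >= 0,
# written in scipy's A_ub x <= b_ub form by negating the rows.

c = np.array([1.0, 2.0])
A_ub = np.array([[-1.0, -1.0],        # encodes x1 + x2 >= 4
                 [-2.0, -1.0]])       # encodes 2 x1 + x2 >= 5
b_ub = np.array([-4.0, -5.0])
bounds = [(0, None), (0, None)]

base = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
eps = 1e-6                            # perturb the first right-hand side
pert = linprog(c, A_ub=A_ub, b_ub=b_ub + np.array([eps, 0.0]),
               bounds=bounds, method="highs")

print("f(b)               =", base.fun)
print("finite-diff df/db1 ≈", (pert.fun - base.fun) / eps)
print("dual price, row 1  =", base.ineqlin.marginals[0])
```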

Book ChapterDOI
01 Jan 1985
TL;DR: A well-known sufficient condition for optimality is expressed in terms of a continuously differentiable function which is a solution to the Hamilton-Jacobi equation of Dynamic Programming.
Abstract: A well-known sufficient condition for optimality is expressed in terms of a continuously differentiable function which is a solution to the Hamilton-Jacobi equation of Dynamic Programming. (A function which serves this purpose is called a Caratheodory function.) However, a continuously differentiable solution may fail to exist, and this limits the usefulness of the condition as classically formulated. Here we ask how the condition might be modified to extend its applicability. Emphasis is given to problems involving terminal constraints on the trajectories. These pose a special challenge since there is no obvious candidate for a Caratheodory function; we must surmise its existence from abstract arguments, or construct it as the value function of an auxiliary problem. Some interesting connections are made with the theory of necessary conditions.
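
For orientation, one common statement of the classical sufficiency condition the chapter starts from (finite horizon, free endpoint; generic notation, not the chapter's):

```latex
% Verification theorem (classical form). Suppose W \in C^1 satisfies the
% Hamilton-Jacobi equation of dynamic programming
\[
-W_t(t,x) \;=\; \min_{u \in U} \bigl\{ L(t,x,u) + W_x(t,x) \cdot f(t,x,u) \bigr\},
\qquad W(T,x) = g(x).
\]
% Then W is the value function, and any admissible control attaining the
% minimum along its own trajectory is optimal. W plays the role of the
% Caratheodory function; the chapter asks what replaces it when no C^1
% solution exists or terminal constraints are present.
```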

01 Jan 1985
TL;DR: In this paper, a finite collection of piecewise-deterministic processes are controlled in order to minimize the expected value of a performance functional with continuous and switching control costs, and the solution of the associated dynamic programming equation is obtained by an iterative approximation using optimal stopping time problems.
Abstract: A finite collection of piecewise-deterministic processes are controlled in order to minimize the expected value of a performance functional with continuous and switching control costs. The solution of the associated dynamic programming equation is obtained by an iterative approximation using optimal stopping time problems.


Posted Content
TL;DR: In this paper, it was shown that the monotonicity property of optimal paths (or, equivalently, the uniform boundedness of the marginal propensity of consumption by unity) is a necessary condition for local (as well as for global) optimality, and also sufficient for local optimality but not for global optimality.
Abstract: We show that the monotonicity property of optimal paths (or, equivalently, the uniform boundedness of the marginal propensity of consumption by unity) is a necessary condition for local (as well as for global) optimality, and is also sufficient for local optimality, but not for global optimality. We also show that the well-known properties of the value function -- continuity and monotonicity -- are sufficient (along with the above conditions) to guarantee global optimality. In other words, if at any stock level, a local non-global maximizer is selected, a discontinuity in the value function will be observed. We suggest that the previous literature on this problem has not distinguished between local and global maxima, and consequently has not attempted to derive conditions that uniquely characterize global optimality. This is the major aim of this paper, and we hope to have provided some insight towards a systematic approach to non-convex dynamic optimization.



Journal ArticleDOI
TL;DR: In this paper, the global existence of a pointwise solution to the Hamilton-Jacobi equation for totally observed controlled diffusions in Hilbert spaces is proved by studying the corresponding control problem.
Abstract: The global existence of a pointwise solution to the Hamilton-Jacobi equation for totally observed controlled diffusions in Hilbert spaces is proved by studying the corresponding control problem. The optimality principle for the control problem leads to local results, whilst an a priori bound is achieved by introducing a secondary minimization problem.

Journal ArticleDOI
TL;DR: In this article, necessary conditions for optimality are derived for the time-optimal control problem, developing Bellman's idea, without requiring differentiability of the right-hand side of the state equation.
Abstract: We know that the Bellman equation in the theory of optimal processes requires highly restrictive assumptions for its derivation, not fulfilled in many simple problems (see, e.g., [1, p. 85]). However, it often turns out that the Pontryagin maximum principle is applicable in these cases. But the maximum principle requires differentiability of the right-hand side of the equation ẋ = f(t, x, u) with respect to x. Recently, necessary conditions for optimality have been obtained without the requirement of differentiability, but under the Lipschitz condition, on the basis of an idea of Clarke [2]. Here we give a derivation of necessary conditions for optimality, developing Bellman's idea, but in a somewhat different form and under other conditions that cover a quite wide class of problems, in particular, certain problems without the Lipschitz condition. For simplicity, we consider the time-optimal control problem.