
Showing papers on "Bellman equation published in 1988"


Journal ArticleDOI
TL;DR: In this article, the authors studied the connections between deterministic exit time control problems and possibly discontinuous viscosity solutions of a first-order Hamilton-Jacobi (HJ) equation up to the boundary.
Abstract: The authors study the connections between deterministic exit time control problems and possibly discontinuous viscosity solutions of a first-order Hamilton-Jacobi (HJ) equation up to the boundary. This equation admits a maximum and a minimum solution that are the value functions associated to stopping time problems on the boundary. When these solutions are equal, they can be obtained through the vanishing viscosity method. Finally, when the HJ equation has a continuous solution, it is proved to be the value function for the first exit time of the domain. It is also the vanishing viscosity limit arising, in particular, in some large deviations problems.

251 citations


Journal ArticleDOI
TL;DR: In this paper, a study of differentiability properties of the optimal value function and an associated optimal solution of a parametrized nonlinear program is presented under the Mangasarian-Fromovitz constraint qualification when the corresponding vector of Lagrange multipliers is not necessarily unique.
Abstract: This paper is concerned with a study of differentiability properties of the optimal value function and an associated optimal solution of a parametrized nonlinear program. Second order analysis is presented essentially under the Mangasarian-Fromovitz constraint qualification when the corresponding vector of Lagrange multipliers is not necessarily unique. It is shown that under certain regularity conditions the optimal value function possesses second order directional derivatives and the optimal solution mapping is directionally differentiable. The results obtained are applied to an investigation of metric projections in finite-dimensional spaces.

132 citations


Journal ArticleDOI
TL;DR: In this article, the boundary behavior and optimal portfolio rules for cases when marginal utility at zero consumption is finite are discussed. However, they do not satisfy the Hamilton-Jacobi-Bellman equations and do not represent appropriate value functions, because the boundary behavior near zero wealth is not satisfactorily dealt with.

78 citations


Book ChapterDOI
01 Jan 1988
TL;DR: In this article, the authors investigated the Bellman equation that arises in the optimal control of Markov processes, derived existence and uniqueness results for viscosity solutions, and developed the connection with the optimal control problem.
Abstract: We investigate the Bellman equation that arises in the optimal control of Markov processes. This is a fully nonlinear integro-differential equation. The notion of viscosity solutions is introduced and then existence and uniqueness results are obtained. Also, the connection between the optimal control problem and the Bellman equation is developed.

76 citations


Journal ArticleDOI
TL;DR: Hölder, Lipschitz and differential properties of the optimal solutions of a nonlinear mathematical programming problem with perturbations in some fixed direction are obtained using an approach based on duality and stability.
Abstract: This paper is concerned with Hölder, Lipschitz and differential properties of the optimal solutions of a nonlinear mathematical programming problem with perturbations in some fixed direction. These properties are obtained with virtually minimal regularity conditions using an approach based on duality and stability. The Hölder property is used to obtain the directional derivative for the optimal value function.

66 citations


Journal ArticleDOI
Makiko Nisio1
TL;DR: In this paper, the authors extend partially P. L. Lions' results to stochastic differential games, where two players conflict with each other, and show that if the value function of a stochastic differential game is smooth enough, then it satisfies a second-order partial differential equation with max-min or min-max type nonlinearity, called the Isaacs equation.
Abstract: Recently P. L. Lions has demonstrated the connection between the value function of stochastic optimal control and a viscosity solution of the Hamilton-Jacobi-Bellman equation [cf. 10, 11, 12]. The purpose of this paper is to extend partially his results to stochastic differential games, where two players conflict with each other. If the value function of a stochastic differential game is smooth enough, then it satisfies a second-order partial differential equation with max-min or min-max type nonlinearity, called the Isaacs equation [cf. 5]. Since we can write a nonlinear function as the min-max of appropriate affine functions, under some mild conditions, stochastic differential game theory provides some convenient representation formulas for solutions of nonlinear partial differential equations [cf. 1, 2, 3].

51 citations


Journal ArticleDOI
TL;DR: The relationship between the Pontryagin maximum principle and dynamic programming, now expressed in terms of the generalized gradient of V, is established for a large class of nonsmooth problems.
Abstract: The dynamic programming approach to optimal control theory attempts to characterize the value function V as a solution to the Hamilton-Jacobi-Bellman equation. Heuristic arguments have long been advanced relating the Pontryagin maximum principle and dynamic programming according to the equation (H(t, x*(t), u*(t), p(t)), −p(t)) = ∇V(t, x*(t)), where (x*, u*) is the optimal control process under consideration, p(t) is the coextremal, and H is the Hamiltonian. The relationship has previously been verified under only very restrictive hypotheses. We prove new results, establishing the relationship, now expressed in terms of the generalized gradient of V, for a large class of nonsmooth problems.

46 citations


Journal ArticleDOI
TL;DR: In this paper, the value function of distributed parameter control problems is shown to be the unique viscosity solution of the corresponding Hamilton-Jacobi-Bellman equation, and the main assumption is the existence of an increasing sequence of compact invariant subsets of the state space.
Abstract: This paper is concerned with a certain class of distributed parameter control problems. The value function of these problems is shown to be the unique viscosity solution of the corresponding Hamilton-Jacobi-Bellman equation. The main assumption is the existence of an increasing sequence of compact invariant subsets of the state space. In particular, this assumption is satisfied by a class of controlled delay equations.

40 citations


Journal ArticleDOI
TL;DR: In this article, it was shown that in the case of both continuously and impulsively controlled Markov-type stochastic processes, the usual Bellman optimality equation for the value function v turns into a system of the form T1 v ≤ 0, T2 v ≤ 0, T1 v · T2 v = 0, where T1 and T2 are, respectively, a non-differential and a differential nonlinear operator.
Abstract: In the case of both continuously and impulsively controlled Markov-type stochastic processes the usual Bellman optimality equation for the value function v turns into a system of the form T1 v ≤ 0, T2 v ≤ 0, T1 v · T2 v = 0, where T1 and T2 are, respectively, a non-differential and a differential non-linear operator (in controlled diffusion this system became known as "quasi-variational inequalities"). We examine to what extent analogous inequalities are valid in Markov decision deterministic drift processes, in which randomness affects only jumps. As auxiliary results we prove that v is upper semianalytic, that almost everywhere ε-optimal policies do exist, and also prove a defect formula for policies which is closely related to the inequalities under consideration. Our investigation is based on the concept of stochastic processes with a randomly split time parameter.

30 citations


Journal ArticleDOI
TL;DR: In this paper, the authors studied the finite horizon Bellman equation for controlled Markov jump models with unbounded jump and cost rates and proved the existence of a solution, and constructed a computationally attractive approximation scheme.

26 citations


Journal ArticleDOI
TL;DR: In this paper, Pontryagin et al. consider a general class of Lagrangian-type control problems and associate with them a set of necessary conditions that an optimal control must satisfy.

Journal ArticleDOI
S. Zlobec1
TL;DR: This paper is a survey of basic results that characterize optimality in single- and multi-objective mathematical programming models; by characterizing structural optima, it obtains some new, and recovers the familiar, optimality conditions in nonlinear programming.
Abstract: This paper is a survey of basic results that characterize optimality in single- and multi-objective mathematical programming models. Many people believe, or want to believe, that the underlying behavioural structure of management, economic, and many other systems, generates basically ‘continuous’ processes. This belief motivates our definition and study of optimality, termed ‘structural’ optimality. Roughly speaking, we say that a feasible point of a mathematical programming model is structurally optimal if every improvement of the optimal value function, with respect to parameters, results in discontinuity of the corresponding feasible set of decision variables. This definition appears to be more suitable for many applications and it is also more general than the usual one: every optimum is a structural optimum but not necessarily vice versa. By characterizing structural optima, we obtain some new, and recover the familiar, optimality conditions in nonlinear programming. The paper is self-contained. Our approach is geometric and inductive: we develop intuition by studying finite-dimensional models before moving on to abstract situations.

Journal ArticleDOI
TL;DR: In this article, the authors considered differential games with semicontinuous payoff functions and correspondingly semicontinuous solutions, defined generalized (viscosity) solutions by means of pairs of differential inequalities, and proved existence and uniqueness theorems for these solutions.

Journal ArticleDOI
TL;DR: Generalized DP, based on a weaker principle of optimality, produces optimal solutions even in the absence of monotonicity; it is shown both to subsume conventional DP and to provide an alternative means of implementing Mitten's preference order DP.
Abstract: Extensions of dynamic programming (DP) into generalized preference structures, such as exist in multicriteria optimization, have invariably assumed sufficient structure to ensure the validity of Bellman's “principle of optimality” or the presence of monotonicity, hence ensuring the validity of the functional equations of DP. Often, however, the only reasonable formulation of a problem (in terms of the size of the resulting state space) is one for which neither monotonicity nor the (strong) principle of optimality is satisfied. Generalized DP, based on a weaker principle of optimality, is a generalization of DP that produces optimal solutions, even in the absence of monotonicity. It is shown both to subsume conventional DP and to provide an alternative means of implementing Mitten's preference order DP. This paper provides an overview of generalized DP and its applications to date, summarizing our computational experiences in the process.
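The conventional DP that generalized DP subsumes rests on Bellman's functional equation. As a minimal sketch (the staged graph, node names, and edge costs below are made-up illustrations, not from the paper), backward recursion on an acyclic shortest-path problem looks like this:

```python
# Conventional DP on a small acyclic shortest-path problem, illustrating
# Bellman's functional equation V(s) = min_{s'} [c(s, s') + V(s')].
# The graph is a hypothetical example.

# node -> {successor: edge cost}
graph = {
    "A": {"B": 1, "C": 4},
    "B": {"D": 2, "E": 6},
    "C": {"D": 3, "E": 1},
    "D": {"T": 5},
    "E": {"T": 2},
    "T": {},
}

def shortest_costs(graph, goal="T"):
    """Solve V(s) = min_{s'} [c(s, s') + V(s')] by repeated relaxation."""
    V = {goal: 0.0}
    changed = True
    while changed:
        changed = False
        for s, succ in graph.items():
            if s == goal or not succ:
                continue
            # Only update once all successor values are known (graph is acyclic).
            candidates = [c + V[t] for t, c in succ.items() if t in V]
            if len(candidates) == len(succ):
                v = min(candidates)
                if V.get(s) != v:
                    V[s] = v
                    changed = True
    return V

V = shortest_costs(graph)
print(V["A"])  # -> 7.0, the cost of the best A -> T path (A-C-E-T)
```

The principle of optimality is what licenses discarding all but the best tail cost at each node; generalized DP weakens exactly this step.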

Journal ArticleDOI
TL;DR: In this paper, the optimal control of diffusion processes on the infinite time interval is studied and the Bellman equation for the problem is given an interpretation connected with the overtaking optimality notion.
Abstract: The optimal control of diffusion processes on the infinite time interval is studied. All the costs diverge to infinity and we employ the overtaking criterion, as well as considering minimal growth rate controls. The Bellman equation for the problem is considered and its solution is given an interpretation connected with the overtaking optimality notion. Control problems with a cost including a generalized discount factor (which is not integrable on [0, ∞)) are also studied. Both cases, where the diffusion is in R^n or where it is reflected from the boundary of a bounded set, are considered.

Journal ArticleDOI
TL;DR: In this article, a finite collection of piecewise-deterministic processes are controlled in order to minimize the expected value of a performance functional with continuous operating cost and discrete switching control costs.
Abstract: A finite collection of piecewise-deterministic processes are controlled in order to minimize the expected value of a performance functional with continuous operating cost and discrete switching control costs. The solution of the associated dynamic programming equation is obtained by an iterative approximation using optimal stopping time problems.

Journal ArticleDOI
01 Jun 1988
TL;DR: Using homogenization theory, the authors treated the problem of controlled diffusions in a random medium with rapidly varying composition, which involves homogenization of a nonlinear Bellman dynamic programming equation.
Abstract: Using homogenization theory we treat the problem of controlled diffusions in a random medium with rapidly varying composition. This involves homogenization of a nonlinear Bellman dynamic programming equation.

Proceedings ArticleDOI
07 Dec 1988
TL;DR: The numerical solution of an optimal correction problem for a damped random linear oscillator is studied via the associated dynamic programming equation; a correction-type algorithm based on a discrete maximum principle is introduced to ensure the convergence of the iteration procedure.
Abstract: The numerical solution of an optimal correction problem for a damped random linear oscillator is studied. A numerical algorithm for the discretized system of the associated dynamic programming equation is given. To initiate the computation, a numerical scheme derived from the deterministic version of the problem is adopted. A correction-type algorithm based on a discrete maximum principle is introduced to ensure the convergence of the iteration procedure.

Journal ArticleDOI
TL;DR: In this article, the existence of a solution of the first boundary value problem for a degenerate Bellman equation was proved for a strictly convex region of class with zero data on the boundary.
Abstract: This article is devoted to a proof of a general theorem on the existence of a solution of the first boundary value problem for a degenerate Bellman equation. In contrast to other papers the nonlinearity of the equation is used here and leads, for example, to a proof of solvability of the simplest Monge-Ampere equation for , in a strictly convex region of class with zero data on the boundary. Bibliography: 18 titles.

Journal ArticleDOI
TL;DR: The numerical solution of an optimal correction problem for a damped random linear oscillator is studied; a correction-type algorithm based on a discrete maximum principle is introduced to ensure the convergence of the iteration procedure.
Abstract: The numerical solution of an optimal correction problem for a damped random linear oscillator is studied. A numerical algorithm for the discretized system of the associated dynamic programming equation is given. To initiate the computation, we adopt a numerical scheme derived from the deterministic version of the problem. Next, a correction-type algorithm based on a discrete maximum principle is introduced to ensure the convergence of the iteration procedure.
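The successive-approximation scheme for a discretized dynamic programming equation can be sketched generically. The randomly generated controlled chain below is a stand-in discretization, not the oscillator model of the paper; all sizes and names are assumptions:

```python
import numpy as np

# Successive approximation (value iteration) for a discretized dynamic
# programming equation on a hypothetical controlled Markov chain.

rng = np.random.default_rng(0)
n_states, n_actions, beta = 20, 3, 0.9

# Random transition matrices P[a] (rows sum to 1) and running costs c[s, a].
P = rng.random((n_actions, n_states, n_states))
P /= P.sum(axis=2, keepdims=True)
c = rng.random((n_states, n_actions))

def value_iteration(P, c, beta, tol=1e-10):
    """Iterate the Bellman operator (T v)(s) = min_a [c(s,a) + beta * (P_a v)(s)]
    until the sup-norm update falls below tol (T is a beta-contraction)."""
    v = np.zeros(n_states)
    while True:
        Tv = np.min(c + beta * np.einsum("aij,j->ia", P, v), axis=1)
        if np.max(np.abs(Tv - v)) < tol:
            return Tv
        v = Tv

v = value_iteration(P, c, beta)
# Fixed-point check: v (approximately) satisfies the discretized Bellman equation.
residual = np.max(np.abs(v - np.min(c + beta * np.einsum("aij,j->ia", P, v), axis=1)))
```

Because the Bellman operator is a contraction with modulus beta, the iteration converges from any starting guess; the role of the paper's deterministic initialization and maximum-principle correction is to accelerate and safeguard this convergence in the actual discretized oscillator problem.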

Journal ArticleDOI
TL;DR: In this paper, an iterative algorithm based on a priori deduction from Bellman's principle of optimality was developed for solving eigenvalue problems, where the set of admissible states was restricted only to those states that are "near" the nominal trajectory.
Abstract: This paper develops an iterative algorithm, based on a priori deduction from Bellman's principle of optimality, for solving eigenvalue problems. During each iteration, the set of admissible states is restricted only to those states that are "near" the nominal trajectory. The algorithm is shown to use only minimal storage requirements. The significance of the method is that it provides a means of reducing Bellman's "curse of dimensionality" and broadens the scope of problems that can effectively be solved with the dynamic programming approach. The technique is then applied to evaluate the smallest eigenvalue for the differential equation arising in the mathematical modelling of the desorption from a liquid film.

Journal ArticleDOI
TL;DR: In this article, the authors provided a turnpike-like theorem for multidimensional, optimal-growth models, which holds for every level of the discount factor, and showed that when the short-run return function of the reduced-form model satisfies a certain sufficient condition, then the resulting dynamics is of a simple type.
Abstract: This paper provides a turnpike-like theorem for multidimensional, optimal-growth models, which holds for every level of the discount factor. It is shown that when the short-run return function of the reduced-form model satisfies a certain sufficient condition, then the resulting dynamics is of a simple type, i.e., it must converge to some steady state. The result is obtained in two steps: first it is shown that dynamical systems satisfying an acyclic binary relation must be simple; second, the value function for the optimal problem is used to define a binary relation on the space of feasible states. The necessary and sufficient conditions under which the latter binary relation is acyclic are provided, and their relation to the technology and preferences is outlined. Copyright 1988 by Economics Department of the University of Pennsylvania and the Osaka University Institute of Social and Economic Research Association.

Journal ArticleDOI
TL;DR: In this article, the authors show that calculating the infinite-horizon value function for a linear/quadratic Markov decision process by policy improvement is exactly equivalent to solving the equilibrium Riccati equation by the Newton-Raphson method.
Abstract: We show that the calculation of the infinite-horizon value function for a linear/quadratic Markov decision process by policy improvement is exactly equivalent to solution of the equilibrium Riccati equation by the Newton-Raphson method. The assertion extends to risk-sensitive and non-Markov formulations and thus shows, for example, that the Newton-Raphson method provides an iterative algorithm for the canonical factorization of operators which shows second-order convergence and has a variational basis.
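The asserted equivalence can be illustrated on a scalar linear/quadratic problem. The sketch below uses made-up coefficients and only checks numerically that policy improvement drives the evaluated value to a root of the algebraic Riccati equation, consistent with the second-order (Newton-Raphson) convergence the paper establishes:

```python
# Scalar discrete-time LQ sketch: policy iteration converging to the root of
# the algebraic Riccati equation. System x' = a*x + b*u, stage cost
# q*x^2 + r*u^2; all coefficients are hypothetical.

a, b, q, r = 1.2, 1.0, 1.0, 1.0

def policy_iteration(k0, n_iter=12):
    """Alternate policy evaluation (solve for P_k) and policy improvement."""
    k = k0
    for _ in range(n_iter):
        closed = a - b * k                              # closed-loop multiplier; need |closed| < 1
        P = (q + r * k * k) / (1.0 - closed * closed)   # policy evaluation: value of gain k
        k = a * b * P / (r + b * b * P)                 # policy improvement: new gain
    return P

P = policy_iteration(k0=0.5)  # k0 = 0.5 is stabilizing since |a - b*k0| = 0.7 < 1

# Riccati residual: P should solve P = q + a^2 P - (a b P)^2 / (r + b^2 P).
residual = abs(q + a * a * P - (a * b * P) ** 2 / (r + b * b * P) - P)
```

For these coefficients the Riccati equation reduces to P^2 - 1.44 P - 1 = 0, and a handful of policy-improvement steps already pin down its positive root to machine precision, reflecting the quadratic convergence of the equivalent Newton-Raphson iteration.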

Journal ArticleDOI
TL;DR: In this paper, the authors construct discrete-time stochastic control problems where the optimal value function (cost-to-go) belongs to a parametrized class of functions that remains invariant under the dynamic programming operator, extending the classical LQG case, where the value function is quadratic.
Abstract: Using logarithmic transformations, we construct discrete-time stochastic control problems where the optimal value function (cost-to-go) belongs to a parametrized class of functions that remains invariant under the dynamic programming operator. This extends a well-known property of the classical LQG problems, where the optimal value function is quadratic. Some related questions are also discussed.


Journal ArticleDOI
TL;DR: In this paper, the authors present an approach to determine the initially unspecified weights in an additive measurable multiattribute value function, which is based on nonlinear programming problems which incorporate partial information concerning the attribute weights or overall relative value of alternatives the decision maker chooses to provide.
Abstract: In this article we present an approach to determine the initially unspecified weights in an additive measurable multiattribute value function. We formulate and solve a series of nonlinear programming problems which (1) incorporate whatever partial information concerning the attribute weights or overall relative value of alternatives the decision maker chooses to provide, yet (2) yield a specific set of weights as a result. Although each formulation is rather easily solved using the nonlinear programming software GINO (general interactive optimizer), solutions in closed form dependent on a single parameter are also provided for a number of these problems.


Posted Content
01 Jan 1988
TL;DR: This paper shows how the theory of dynamic duality can be worked out in dynamic constrained optimization problems, and a family of functions which satisfy the properties of the corresponding value function is provided.
Abstract: It is possible to think of numerous economic problems involving dynamic constrained optimization where dual representations exist and possibly can be characterized. One example arises in the theory of dynamic factor demand and, more generally, investment decisions. Its duality structure has been studied by Epstein in several papers. The extraction of non-renewable resources is an investment problem and, furthermore, specific boundary conditions must be satisfied at the date of exhaustion if it occurs in finite time. This paper shows how the theory of dynamic duality can be worked out in such problems. A family of functions which satisfy the properties of the corresponding value function is also provided. While natural resource extraction is a prime candidate for empirical applications, other applications could be carried out in such areas as R&D, the management of information, and advertising.

Journal ArticleDOI
TL;DR: In this paper, the theory of dynamic duality is applied to the problem of non-renewable resource extraction in the context of investment decisions, and a family of functions which satisfy the properties of the corresponding value function is provided.

Proceedings ArticleDOI
07 Dec 1988
TL;DR: The authors formulate the optimal control problem for queueing models in a unified way, using abstract linear programming, showing how certain properties of the optimal policy can be easily derived, even in cases where dynamic programming (DP) and stochastic dominance (SD) arguments fail.
Abstract: For a significant number of queueing models that appear in diverse, seemingly unrelated application areas, such as routing, resource allocation and flow control, the optimal policy exhibits a certain switching-curve structure. The authors formulate the optimal control problem for such models in a unified way, using abstract linear programming. Using well-known facts from sensitivity analysis of linear programs, they show how certain properties of the optimal policy can be easily derived, even in cases where dynamic programming (DP) and stochastic dominance (SD) arguments fail. A structural property of the optimal value function of the linear program, namely piecewise linearity, is exploited to derive properties of the optimal cost function. The authors also consider additional problems in the realm of queueing system control in which DP or SD approaches are not applicable but linear programming can provide useful results.
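The linear-programming view of MDP optimal control can be sketched on a discounted min-cost model: the optimal value function is the largest v satisfying v(s) ≤ c(s,a) + β Σ P(s'|s,a) v(s') for every state-action pair, which is a linear program. The two-state model and the scipy-based solution below are illustrative assumptions, not the authors' queueing formulation:

```python
import numpy as np
from scipy.optimize import linprog

# Discounted min-cost MDP (hypothetical 2-state, 2-action model).
beta = 0.9
P = np.array([[[0.8, 0.2], [0.1, 0.9]],   # P[s, a, s']: transition probabilities
              [[0.5, 0.5], [0.3, 0.7]]])
cost = np.array([[1.0, 2.0],              # cost[s, a]: one-step costs
                 [0.5, 3.0]])
n_s, n_a = cost.shape

# LP: maximize sum_s v(s)  subject to  v(s) - beta * P[s, a] @ v <= cost[s, a].
A_ub, b_ub = [], []
for s in range(n_s):
    for a in range(n_a):
        row = -beta * P[s, a]
        row[s] += 1.0
        A_ub.append(row)
        b_ub.append(cost[s, a])
res = linprog(c=-np.ones(n_s), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n_s)
v_lp = res.x

# Cross-check: value iteration on the same model converges to the same v.
v = np.zeros(n_s)
for _ in range(2000):
    v = np.min(cost + beta * np.einsum("saj,j->sa", P, v), axis=1)
```

It is the sensitivity analysis of such linear programs (e.g. how the optimum moves with the cost vector) that the paper exploits to derive structural properties of the optimal policy where DP and stochastic-dominance arguments fail.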