
Showing papers on "Bellman equation" published in 1992


Journal ArticleDOI
TL;DR: In this article, a stochastic differential formulation of recursive utility is given sufficient conditions for existence, uniqueness, time consistency, monotonicity, continuity, risk aversion, concavity, and other properties.
Abstract: A stochastic differential formulation of recursive utility is given sufficient conditions for existence, uniqueness, time consistency, monotonicity, continuity, risk aversion, concavity, and other properties. In the setting of Brownian information, recursive and intertemporal expected utility functions are observationally distinguishable. However, one cannot distinguish between a number of non-expected-utility theories of one-shot choice under uncertainty after they are suitably integrated into an intertemporal framework. In a "smooth" Markov setting, the stochastic differential utility model produces a generalization of the Hamilton-Jacobi-Bellman characterization of optimality. A companion paper explores the implications for asset prices. Copyright 1992 by The Econometric Society.
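For context, the generalization mentioned above is usually written with the recursive-utility aggregator in place of the additive felicity term. A schematic form, in our notation rather than the paper's display, is

$$\sup_{c \in C} \Big\{ f\big(c, J(x)\big) + \mathcal{A}^{c} J(x) \Big\} = 0,$$

where $\mathcal{A}^{c}$ denotes the generator of the controlled state process and $f$ is the aggregator; the additive case $f(c, v) = u(c) - \beta v$ recovers the classical Hamilton-Jacobi-Bellman equation.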

1,040 citations


Journal ArticleDOI
Lihe Wang1
TL;DR: The theory of viscosity solutions as mentioned in this paper also applies to fully nonlinear equations (in which even the second order derivatives can enter in nonlinear fashion). Solutions produced by the theory are guaranteed to be continuous, but not necessarily smooth.
Abstract: Recently M. Crandall and P. L. Lions [3] developed a very successful method for proving the existence of solutions of nonlinear second-order partial differential equations. Their method, called the theory of viscosity solutions, also applies to fully nonlinear equations (in which even the second-order derivatives can enter in nonlinear fashion). Solutions produced by the viscosity method are guaranteed to be continuous, but not necessarily smooth. Here we announce smoothness results for viscosity solutions. Our methods extend those of [1]. We obtain Krylov-Safonov (i.e., $C^{\alpha}$) estimates [8], $C^{1,\alpha}$, Schauder ($C^{2,\alpha}$), and $W^{2,p}$ estimates for viscosity solutions of uniformly parabolic equations in general form. The results can be viewed as a priori estimates on classical $C^{2}$ solutions. Our method produces, in particular, regularity results for a broad new array of nonlinear heat equations, including the Bellman equation [6].
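For orientation, a uniformly parabolic Bellman equation of the kind covered by such estimates can be written schematically (our notation; the paper's own display is not reproduced in this listing) as

$$\partial_t u - \inf_{\alpha \in A} \big( a^{\alpha}_{ij}(x,t)\, D_{ij} u + f^{\alpha}(x,t) \big) = 0,$$

with ellipticity constants $\lambda |\xi|^2 \le a^{\alpha}_{ij}\, \xi_i \xi_j \le \Lambda |\xi|^2$ holding uniformly in $\alpha$.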

344 citations


Journal ArticleDOI
TL;DR: In this paper, the authors interpret a fully nonlinear second-order partial differential equation as the value function of a certain optimal controlled diffusion problem, in which a second-order elliptic partial differential operator, built from coefficient functions σ, b, and c, is parametrized by the control variable α ∈ A.
Abstract: We interpret a fully nonlinear second-order partial differential equation as the value function of a certain optimal controlled diffusion problem, in which a second-order elliptic partial differential operator, built from coefficient functions σ, b, and c, is parametrized by the control variable α ∈ A. A particular case of this equation is the well-known Hamilton-Jacobi-Bellman equation. The problem is formulated as follows: the state equation of the control problem is a classical one, while the cost function is described by an adapted solution of a certain backward stochastic differential equation. The paper discusses Bellman's dynamic programming principle for this problem. The value function is proved to be a viscosity solution of the above, possibly degenerate, fully nonlinear equation.
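A schematic version consistent with this description, in our own notation (the paper's displayed formulas are not reproduced in this listing), is

$$-\partial_t u - \inf_{\alpha \in A} \Big\{ \mathcal{L}^{\alpha} u + f\big(x, u, \sigma^{\top}(x,\alpha) D u, \alpha\big) \Big\} = 0, \qquad \mathcal{L}^{\alpha} u = \tfrac{1}{2}\operatorname{tr}\big(\sigma\sigma^{\top}(x,\alpha)\, D^{2} u\big) + b(x,\alpha) \cdot D u + c(x,\alpha)\, u.$$

When the driver $f$ does not depend on $u$ and $Du$, the cost reduces to a classical expected integral and the equation becomes the usual Hamilton-Jacobi-Bellman equation.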

328 citations


Journal ArticleDOI
TL;DR: A near-optimal controller design technique is proposed, which provides an approximate numerical solution to the Bellman equation, a tight lower bound for the optimality gap of tractable, near-optimal controller designs, and a building block for improved, near-optimal controller designs that rely on the decomposition of a multiple part-type problem into smaller (two or three part-type) problems.
Abstract: Dynamic allocation of stochastic capacity among competing activities in a just-in-time manufacturing environment is addressed by optimal flow control. Optimal policies are characterized by generally intractable Bellman equations. A near-optimal controller design technique is proposed. It provides an approximate numerical solution to the Bellman equation, a tight lower bound for the optimality gap of tractable, near-optimal controller designs, and a building block for improved, near-optimal controller designs that rely on the decomposition of a multiple part-type problem into smaller (two or three part-type) problems. Computational experience is reported for two and three part-type problems.

122 citations



Journal ArticleDOI
TL;DR: The algorithm OPTCON, which has been developed for the optimal control of nonlinear stochastic models, is described; it is applied to obtain approximate numerical solutions of control problems in which the objective function is quadratic and the dynamic system is nonlinear.
Abstract: In this paper we describe the algorithm OPTCON which has been developed for the optimal control of nonlinear stochastic models. It can be applied to obtain approximate numerical solutions of control problems where the objective function is quadratic and the dynamic system is nonlinear. In addition to the usual additive uncertainty, some or all of the parameters of the model may be stochastic variables. The optimal values of the control variables are computed in an iterative fashion: First, the time-invariant nonlinear system is linearized around a reference path and approximated by a time-varying linear system. Second, this new problem is solved by applying Bellman's principle of optimality. The resulting feedback equations are used to project expected optimal state and control variables. These projections then serve as a new reference path, and the two steps are repeated until convergence is reached. The algorithm has been implemented in the statistical programming system GAUSS. We derive some mathematical results needed for the algorithm and give an overview of the structure of OPTCON. Moreover, we report on some tentative applications of OPTCON to two small macroeconometric models for Austria.
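The two-step iteration described above can be sketched in a few lines of Python. This is a minimal illustration of the scheme (linearize around a reference path, solve the resulting time-varying LQ problem by a backward Riccati recursion, resimulate, repeat), not the published GAUSS implementation; all function and variable names are our own, and the affine correction terms that a full tracking derivation would carry are omitted.

```python
import numpy as np

def optcon_like(f, fx, fu, Q, R, x_target, x0, T, n_iter=50, tol=1e-8):
    """Sketch: (1) linearize x_{t+1} = f(x_t, u_t) around a reference path,
    (2) solve the time-varying LQ problem via Bellman's principle
    (backward Riccati recursion), (3) simulate forward to obtain a new
    reference path; repeat until convergence.
    fx, fu: Jacobians of f w.r.t. x and u; Q, R: quadratic weights;
    x_target: desired state path of shape (T+1, n)."""
    m = R.shape[0]
    xs = np.tile(x0, (T + 1, 1))          # initial reference: constant path
    us = np.zeros((T, m))
    for _ in range(n_iter):
        # (1) time-varying linearization along the current reference path
        A = [fx(xs[t], us[t]) for t in range(T)]
        B = [fu(xs[t], us[t]) for t in range(T)]
        # (2) backward Riccati recursion for the feedback gains
        P, K = Q.copy(), [None] * T
        for t in reversed(range(T)):
            G = R + B[t].T @ P @ B[t]
            K[t] = np.linalg.solve(G, B[t].T @ P @ A[t])
            P = Q + A[t].T @ P @ (A[t] - B[t] @ K[t])
        # (3) forward pass: project expected state and control paths
        xs_new, us_new = np.zeros_like(xs), np.zeros_like(us)
        xs_new[0] = x0
        for t in range(T):
            us_new[t] = us[t] - K[t] @ (xs_new[t] - x_target[t])
            xs_new[t + 1] = f(xs_new[t], us_new[t])
        done = np.max(np.abs(xs_new - xs)) < tol
        xs, us = xs_new, us_new
        if done:
            break
    return xs, us
```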

60 citations


Book
01 Apr 1992
TL;DR: Oege De Moor (1994) Categories, relations and dynamic programming, Mathematical Structures in Computer Science, 4, pp. 33-69.
Abstract: Dynamic programming is a strategy for solving optimisation problems. In this paper, we show how many problems that may be solved by dynamic programming are instances of the same abstract specification. This specification is phrased using the calculus of relations offered by topos theory. The main theorem underlying dynamic programming can then be proved by straightforward equational reasoning. The generic specification of dynamic programming makes use of higher-order operators on relations, akin to the fold operators found in functional programming languages. In the present context, a data type is modelled as an initial F-algebra, where F is an endofunctor on the topos under consideration. The mediating arrows from this initial F-algebra to other F-algebras are instances of fold – but only for total functions. For a regular category E, it is possible to construct a category of relations Rel(E). When a functor between regular categories is a so-called relator, it can be extended (in some canonical way) to a functor between the corresponding categories of relations. Applied to an endofunctor on a topos, this process of extending functors preserves initial algebras, and hence fold can be generalised from functions to relations. It is well known that the use of dynamic programming is governed by the principle of optimality. Roughly, the principle of optimality says that an optimal solution is composed of optimal solutions to subproblems. In a first attempt, we formalise the principle of optimality as a distributivity condition. This distributivity condition is elegant, but difficult to check in practice. The difficulty arises because we consider minimum elements with respect to a preorder, and therefore minimum elements are not unique. Assuming that we are working in a Boolean topos, it can be proved that monotonicity implies distributivity, and this monotonicity condition is easy to verify in practice.
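The fold-based reading of dynamic programming can be illustrated concretely. The sketch below (plain Python rather than the paper's categorical setting; all names are our own) solves a 0/1 knapsack by folding over the items while keeping only non-dominated (weight, value) pairs. The monotonicity of the step function is what licenses the pruning, mirroring the monotonicity-implies-distributivity condition discussed above.

```python
from functools import reduce

def knapsack(items, capacity):
    """Dynamic programming as a fold that 'thins' partial solutions:
    each step extends the set of (weight, value) pairs and keeps only
    the non-dominated ones. items: list of (weight, value) pairs."""
    def thin(pairs):
        # keep the Pareto frontier: as weight increases, value must
        # strictly increase
        kept, best = [], -1
        for w, v in sorted(pairs, key=lambda p: (p[0], -p[1])):
            if v > best:
                kept.append((w, v))
                best = v
        return kept

    def step(acc, item):
        w, v = item
        extended = [(pw + w, pv + v) for pw, pv in acc if pw + w <= capacity]
        return thin(acc + extended)

    return max(v for _, v in reduce(step, items, [(0, 0)]))

# e.g. knapsack([(3, 4), (2, 3), (4, 5)], 5) == 7
```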

55 citations


Posted Content
TL;DR: In this paper, a singular stochastic control problem for non-degenerate systems is studied, where the model equation is nonlinear and the cost function need not be convex.
Abstract: This paper is concerned with singular stochastic control for non-degenerate problems. It generalizes the previous work in that the model equation is nonlinear and the cost function need not be convex. The associated dynamic programming equation takes the form of variational inequalities. By combining the principle of dynamic programming and the method of penalization, we show that the value function is characterized as a unique generalized (Sobolev) solution which satisfies the dynamic programming variational inequality almost everywhere. The approximation for our singular control problem is given in terms of a family of penalized control problems. As a result of such a penalization, we obtain that the value function is also the minimum cost available when only the admissible pairs with uniformly Lipschitz controls are admitted in our cost criterion.
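Schematically (our notation, not the paper's precise operators), the value function of such a singular control problem satisfies a variational inequality of the form

$$\max\big( \mathcal{L} u - f,\ |\nabla u| - g \big) = 0,$$

and the penalization method replaces it by a family of semilinear equations such as

$$\mathcal{L} u_{\varepsilon} - f + \tfrac{1}{\varepsilon} \big( |\nabla u_{\varepsilon}| - g \big)^{+} = 0,$$

whose solutions $u_{\varepsilon}$ approximate $u$ as $\varepsilon \to 0$ and correspond to control problems with Lipschitz controls. This is only the general pattern of the argument; the paper's exact penalty term is not reproduced here.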

40 citations


Journal ArticleDOI
TL;DR: In this article, an asymptotic analysis of hierarchical production planning in a manufacturing system with serial machines that are subject to breakdown and repair, and with convex costs is presented.
Abstract: This paper presents an asymptotic analysis of hierarchical production planning in a manufacturing system with serial machines that are subject to breakdown and repair, and with convex costs. The machine capacities are modeled as Markov chains. Since the number of parts in the internal buffers between any two machines needs to be non-negative, the problem is inherently a state constrained problem. As the rate of change in machine states approaches infinity, the analysis results in a limiting problem in which the stochastic machine capacity is replaced by the equilibrium mean capacity. A method of “lifting” and “modification” is introduced in order to construct near optimal controls for the original problem by using near optimal controls of the limiting problem. The value function of the original problem is shown to converge to the value function of the limiting problem, and the convergence rate is obtained based on some a priori estimates of the asymptotic behavior of the Markov chains. As a result, an ...

39 citations


Journal ArticleDOI
Hang Zhu1
TL;DR: In this article, a singular stochastic control problem for non-degenerate systems is studied, where the model equation is nonlinear and the cost function need not be convex.
Abstract: This paper is concerned with singular stochastic control for non-degenerate problems. It generalizes the previous work in that the model equation is nonlinear and the cost function need not be convex. The associated dynamic programming equation takes the form of variational inequalities. By combining the principle of dynamic programming and the method of penalization, we show that the value function is characterized as a unique generalized (Sobolev) solution which satisfies the dynamic programming variational inequality in the almost everywhere sense. The approximation for our singular control problem is given in terms of a family of penalized control problems. As a result of such a penalization, we obtain that the value function is also the minimum cost available when only the admissible pairs with uniformly Lipschitz controls are admitted in our cost criterion.

38 citations


Journal ArticleDOI
TL;DR: In this paper, a two-domain decomposition for the Hamilton-Jacobi-Bellman equation in $R^n$ is presented, where the original problem is split into two problems with state constraints plus a linking condition.

Journal ArticleDOI
TL;DR: In this paper, the authors considered the principle of smooth fit for a class of one-dimensional singular stochastic control problems allowing the system to be of nonlinear diffusion type, and proved the existence and uniqueness of a convex $C^2$-solution to the corresponding variational inequality.
Abstract: This paper considers the principle of smooth fit for a class of one-dimensional singular stochastic control problems allowing the system to be of nonlinear diffusion type. The existence and the uniqueness of a convex $C^2$-solution to the corresponding variational inequality are obtained. It is proved that this solution gives the value function of the control problem, and the optimal control process is constructed. As an example of the degenerate case, it is proved that the conclusion is also true for linear systems, and the explicit formula for the smooth fit points is derived.

Journal ArticleDOI
TL;DR: In this article, the authors studied the Bellman equation of ergodic control, which has the form (1.1) $Lu + H(x, \nabla u) + \lambda = q(x)$, $x \in R^N$, where $L$ is a second-order differential operator $Lu = -D_i(a_{ij} D_j u)$ and $H$ is a nonlinear function of $\nabla u$, called the Hamiltonian.
Abstract: We are interested in the study of equations of the form (1.1) $Lu + H(x, \nabla u) + \lambda = q(x)$, $x \in R^N$, where $L$ is a second-order differential operator $Lu = -D_i(a_{ij} D_j u)$ and $H$ is a nonlinear function of $\nabla u$, called the Hamiltonian. In (1.1), $\lambda$ is a constant. One should view $(u, \lambda)$ as a pair of unknowns and observe that $u$ is defined up to an additive constant. Equation (1.1) is called the Bellman equation of ergodic control. It arises naturally in the context of control theory, as the limit, as $\alpha$ tends to 0, of the problem (1.2) $Lu_\alpha + H(x, \nabla u_\alpha) + \alpha u_\alpha = q$. Consider as a model problem the case (1.3) $-\Delta u_\alpha + |\nabla u_\alpha|^2 + \alpha u_\alpha = q$, which can be solved explicitly by the formula

Journal ArticleDOI
TL;DR: In this article, it was shown that a generalized Bellman-Hamilton-Jacobi (BHJ) equation involving the Clarke generalized gradient is a necessary and sufficient optimality condition for piecewise deterministic Markov processes.
Abstract: Piecewise deterministic Markov processes (PDPs) are continuous time homogeneous Markov processes whose trajectories are solutions of ordinary differential equations with random jumps between the different integral curves. Both continuous deterministic motion and the random jumps of the processes are controlled in order to minimize the expected value of a performance criterion involving discounted running and boundary costs. Under fairly general assumptions, we will show that there exists an optimal control, that the value function is Lipschitz continuous and that a generalized Bellman-Hamilton-Jacobi (BHJ) equation involving the Clarke generalized gradient is a necessary and sufficient optimality condition for the problem.

Proceedings ArticleDOI
16 Dec 1992
TL;DR: Theoretical procedures are developed for comparing the performance of arbitrarily selected admissible feedback controls among themselves and with that of the optimal solution of a nonlinear optimal stochastic control problem.
Abstract: Theoretical procedures are developed for comparing the performance of arbitrarily selected admissible feedback controls among themselves and with that of the optimal solution of a nonlinear optimal stochastic control problem. Iterative design schemes are proposed for successively improving the performance of a controller until a satisfactory design is achieved. Specifically, the exact design procedure is based on the generalized Hamilton-Jacobi-Bellman equation for the value function of nonlinear stochastic systems, and the approximate design procedure for the nonlinear stochastic regulator problem with an infinite horizon is developed by using upper and lower bounds to the value functions. For a given controller, both the upper and lower bounds to its value function can be obtained by solving a partial differential inequality. In particular, the upper and lower bounds to the optimal value function, which may be used as a measure to evaluate the acceptability of suboptimal controllers, can be constructed without actually knowing the optimal controller.
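The bounding mechanism can be stated compactly. In our rendering (under boundedness assumptions the paper makes precise), for a fixed admissible feedback $u$ with running cost $c$ and discount rate $\beta > 0$, a standard Dynkin-formula argument shows that if a smooth function $W$ satisfies the partial differential inequality

$$\mathcal{A}^{u} W(x) - \beta W(x) + c\big(x, u(x)\big) \le 0,$$

then $W$ is an upper bound for the value function $V^{u}$ of that controller, and reversing the inequality yields a lower bound. Applying the same device with the infimum over controls gives computable two-sided bounds on the optimal value function without solving the optimal control problem itself.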

Journal ArticleDOI
TL;DR: In this paper, it was shown that the minimal time function for time optimal control problems governed by evolution equations is a (generalized) viscosity solution for the Bellman equation (resp. the dynamic programming equation).
Abstract: In a previous paper the author has introduced a new notion of a (generalized) viscosity solution for Hamilton-Jacobi equations with an unbounded nonlinear term. It is proved here that the minimal time function (resp. the optimal value function) for time optimal control problems (resp. optimal control problems) governed by evolution equations is a (generalized) viscosity solution for the Bellman equation (resp. the dynamic programming equation). It is also proved that the Neumann problem in convex domains may be viewed as a Hamilton-Jacobi equation with a suitable unbounded nonlinear term.

Journal ArticleDOI
TL;DR: In this article, the authors provided sensitivity and duality results for continuous-time optimal capital accumulation models where preferences belong to a class of recursive objectives, and they combined the same topology with a controllability condition to demonstrate the sensitivity of the optimal path with respect to changes in the initial endowment of capital.
Abstract: Summary. This paper provides sensitivity and duality results for continuous-time optimal capital accumulation models where preferences belong to a class of recursive objectives. We combine the topology used by Becker, Boyd and Sung (1989) with a controllability condition to demonstrate that optimal paths are continuous with respect to changes in both the initial capital stock and the rate of time preference. Under convexity and an interiority condition, we find the value function is differentiable, and derive a multiplier equation for the supporting prices. Finally, under some mild additional conditions, we show that supporting prices obeying the transversality and multiplier equations are both necessary and sufficient for an optimum. This paper provides sensitivity and duality results for continuous-time optimal capital accumulation models where the planner's preferences are represented by a recursive objective functional. Time preference is flexible. A previous paper (Becker, Boyd and Sung, 1989) established the existence of optimal paths by choice of an appropriate topology. In this paper, we combine the same topology with a controllability condition to demonstrate the sensitivity of the optimal path with respect to changes in the initial endowment of capital, and changes in the planner's rate of time preference. Under convexity and an interiority condition, we find the value function is differentiable, and derive a multiplier equation for the

Journal ArticleDOI
TL;DR: In this paper, it was shown that the true value function exists, that it is characterized as the unique admissible solution to Bellman's equation, and that it is the limit of successive approximations.
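A minimal sketch of successive approximations for a discounted, finite-state Bellman equation (generic value iteration in Python; the array shapes and names are our own and are not tied to this paper's model):

```python
import numpy as np

def value_iteration(P, c, beta, tol=1e-10):
    """Successive approximations for V = min_a [ c(x,a) + beta * E V ].
    P: (A, S, S) transition probabilities, c: (S, A) one-step costs,
    0 < beta < 1. The Bellman operator is a beta-contraction in the
    sup norm, so the iterates converge to its unique fixed point."""
    V = np.zeros(c.shape[0])
    while True:
        # Q[s, a] = c(s, a) + beta * sum_y P(y | s, a) V(y)
        Q = c + beta * np.tensordot(P, V, axes=([2], [0])).T
        V_new = Q.min(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmin(axis=1)  # value function, greedy policy
        V = V_new
```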


Proceedings ArticleDOI
16 Dec 1992
TL;DR: In this paper, the authors focus on structured solutions to stochastic control models, which are models for which value functions and/or optimal policies have some special dependence on the (initial) state.
Abstract: The solution of the infinite horizon stochastic control problem under certain criteria, the functional characterization and computation of optimal values and policies, is related to two dynamic programming-like functional equations: the discounted cost optimality equation (DCOE) and the average cost optimality equation (ACOE). The authors consider what useful properties, shared by large and important problem classes, can be used to show that an ACOE holds, and how these properties can be exploited to aid in the development of tractable algorithmic solutions. They address this issue by concentrating on structured solutions to stochastic control models. By a structured solution is meant a model for which value functions and/or optimal policies have some special dependence on the (initial) state. The focus is on convexity properties of the value function.
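In the notation usual for such models (our rendering, for a cost-minimization setting with transition law $p$), the two functional equations read

$$V_{\beta}(x) = \min_{a \in A(x)} \Big[ c(x,a) + \beta \sum_{y} p(y \mid x, a)\, V_{\beta}(y) \Big] \quad \text{(DCOE)},$$

$$\rho^{*} + h(x) = \min_{a \in A(x)} \Big[ c(x,a) + \sum_{y} p(y \mid x, a)\, h(y) \Big] \quad \text{(ACOE)},$$

where $\rho^{*}$ is the optimal average cost and $h$ a relative value function. Structured-solution arguments of the kind surveyed typically show that a property such as convexity of $V_{\beta}$ is preserved by the Bellman operator and survives the vanishing-discount limit $\beta \uparrow 1$.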

Journal ArticleDOI
TL;DR: In this paper, the authors consider the existence and uniqueness of viscosity solutions to a system of integro-differential equations with bilateral implicit obstacles, where the underlying state processes are piecewise-deterministic processes.
Abstract: We consider the existence and uniqueness of viscosity solutions to a system of integro-differential equations with bilateral implicit obstacles. The system is the dynamic programming equation associated with a switching game for two players, in which the underlying state processes are piecewise-deterministic processes. We prove the probabilistic representation of the viscosity solution as the saddle point of the game for the $R^n$ case.

Journal ArticleDOI
TL;DR: In this paper, the authors considered the interaction of warfarin and phenylbutazone as a time-optimal control problem with state inequality constraints and showed that necessary optimality conditions and junction conditions for bounded state variables lead to boundary value problems with switching and junction condition.
Abstract: The interaction of the two drugs warfarin and phenylbutazone has previously been considered as a time-optimal control problem with state inequality constraints. We include bounds for the control and show that necessary optimality conditions and junction conditions for bounded state variables lead to boundary value problems with switching and junction conditions. A special version of the multiple-shooting algorithm is employed for solving the different types of boundary value problems. The switching structure of the optimal control is determined for a range of parameters in the state constraint. Owing to the special structure of the control, a state space solution is obtained in a first step which reduces the numerical complexity of the problem. It is shown how the numerical results can be used to compute the generalized gradient of the optimal value function explicitly.


Book ChapterDOI
01 Jan 1992
TL;DR: In this paper, the authors studied the Bolza problem arising in nonlinear optimal control and investigated under what circumstances the necessary conditions for optimality of Pontryagin's type are also sufficient.
Abstract: We study the Bolza problem arising in nonlinear optimal control and investigate under what circumstances the necessary conditions for optimality of Pontryagin’s type are also sufficient. This leads to the question when shocks do not occur in the method of characteristics applied to the associated Hamilton-Jacobi-Bellman equation. In this case the value function is its (unique) continuously differentiable solution and can be obtained from the canonical equations. In optimal control this corresponds to the case when the optimal trajectory of the Bolza problem is unique for every initial state and the optimal feedback is an upper semicontinuous set-valued map with convex, compact images.

Journal ArticleDOI
TL;DR: A stochastic control problem arising from the optimal design of accelerated life tests is formulated and it is proved that the solution is of bang-bang type and that the control problem is equivalent to an optimal stopping problem.
Abstract: We formulate a stochastic control problem arising from the optimal design of accelerated life tests. The model is obtained, in a natural way, in the setup of point processes. The cost functional depends on the total number of items under test at time t, the observed number of items failed up to time t, and the stress level applied to the items under test at time t. The optimal policy minimizing the cost functional is characterized via the dynamic programming equation. After a further specification of the model, we prove that the solution is of bang-bang type. In some specific cases, we also prove that the control problem is equivalent to an optimal stopping problem. This amounts to simply running the test, with the number of items and the stress level at their highest available value, up to the exit time of the state process from a given region. In this special case we exhibit an explicit solution and give a criterion in order to decide whether to let the experiment run or to stop it.


Journal ArticleDOI
TL;DR: In this paper, b-subgradients of the optimal value function in nonlinear programming are studied.
Abstract: (1992). b-Subgradients of the optimal value function in nonlinear programming. Optimization: Vol. 26, No. 3-4, pp. 153-163.

01 Jan 1992
TL;DR: Applications from machine scheduling, production control, and fire egress are presented; the algorithm reduces to multi-objective dynamic programming in the case of constant costs.
Abstract: The applicability of shortest-path algorithms is well known. There have been two separate extensions of the classical shortest-path problem. The first is an extension to finding optimal paths in networks whose links have time-dependent costs, while the second is concerned with the establishment of algorithms to find all non-dominated paths through a network with vector-valued costs. Recent work has shown that the adequate analysis of some important problems, characterized by multiple-objectives in a time-dependent context, requires the formulation of models and algorithms that are capable of capturing both of these advances. An algorithm that solves the following problem is established. Let a network whose links have vector-valued, time-dependent costs be given. Suppose a distinguished node, called the destination node is selected. Find all non-dominated paths from all nodes to the destination node. Apart from satisfying a certain boundedness condition, the cost functions are not restricted. The algorithm is recursive, finite, and constructive. It is a backward and forward procedure, operating simultaneously in the set of links and in the set of paths. No time grid is required. It reduces to Multi-Objective Dynamic Programming in the case of constant costs. In general however, this reduction does not hold, as the Principle of Optimality may be violated. Applications from machine scheduling, production control and fire egress are presented.
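A compact Python sketch of the non-dominated path idea follows. It is illustrative only: it uses static vector costs and our own data layout, whereas the paper's algorithm additionally handles time-dependent costs and works backward and forward to a destination node.

```python
from collections import deque

def dominates(a, b):
    """Component-wise: a is at least as good as b and strictly better
    in some component (cost minimization)."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def nondominated_costs(graph, source, dest):
    """Label-correcting search for the non-dominated cost vectors of
    paths from source to dest. graph: node -> list of (neighbor,
    cost_vector) pairs with non-negative cost tuples; source is
    assumed to have at least one outgoing edge."""
    dim = len(graph[source][0][1])
    zero = tuple([0] * dim)
    labels = {source: {zero}}
    queue = deque([(source, zero)])
    while queue:
        node, cost = queue.popleft()
        if cost not in labels.get(node, set()):
            continue  # this label was pruned after being queued
        for nbr, w in graph.get(node, ()):
            new = tuple(c + d for c, d in zip(cost, w))
            bucket = labels.setdefault(nbr, set())
            if new in bucket or any(dominates(old, new) for old in bucket):
                continue
            # drop labels the new one dominates, then keep it
            bucket.difference_update({old for old in bucket
                                      if dominates(new, old)})
            bucket.add(new)
            queue.append((nbr, new))
    return labels.get(dest, set())
```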

Journal ArticleDOI
TL;DR: In this article, a method of synthesis for a wide class of nonlinear systems affine in control is considered, based on the solution of a special optimal control problem, and the integrand of the optimized functional is chosen in such a way that the Bellman equation has a desired solution.
Abstract: The method of synthesis for a wide class of nonlinear systems affine in control is considered. The proposed approach is based on the solution of a special optimal control problem. The integrand of the optimized functional is chosen in such a way that the Bellman equation has a desired solution. The nonlinear system design is reduced to the examination of the integrand of the optimized functional. To extend the domain of asymptotic stability of the nonlinear system, a sequence of Lyapunov functions is used. The whole system becomes a system with variable structure.
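The inverse-optimality computation behind this kind of design can be made explicit. In a standard rendering of the idea (our notation; the paper's exact class of functionals may differ), for $\dot{x} = f(x) + g(x)u$ with cost $\int_0^\infty \big( q(x) + u^{\top} R u \big)\, dt$, the Bellman equation

$$\min_{u} \Big\{ q(x) + u^{\top} R u + V_x \big( f(x) + g(x) u \big) \Big\} = 0$$

is minimized by $u^{*} = -\tfrac{1}{2} R^{-1} g^{\top}(x)\, V_x^{\top}$, and choosing the integrand term

$$q(x) = -V_x f(x) + \tfrac{1}{4}\, V_x\, g(x)\, R^{-1} g^{\top}(x)\, V_x^{\top}$$

makes a prescribed positive definite $V$ (a Lyapunov function candidate) the solution of the Bellman equation, so the design reduces to examining this integrand.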

Book ChapterDOI
01 Jan 1992
TL;DR: In this paper, the authors considered a finite horizon optimal control problem in Mayer form for a system governed by a semilinear state equation and proved that the associated value function is differentiable along optimal trajectories.
Abstract: We consider a finite horizon optimal control problem in Mayer form for a system governed by a semilinear state equation. We prove that, under suitable assumptions, the associated value function is differentiable along optimal trajectories. For this purpose we prove a backward uniqueness result for a class of abstract evolution equations of parabolic type.