
Showing papers on "Bellman equation published in 1986"


Journal ArticleDOI
TL;DR: A manufacturing system can be in one of two states, functional and failed, and it moves back and forth between these two states as a continuous-time Markov chain, with mean time between failures 1/q1 and mean time to repair 1/q2.
Abstract: We address the problem of controlling the production rate of a failure-prone manufacturing system so as to minimize the discounted inventory cost, where certain cost rates are specified for both positive and negative inventories, and there is a constant demand rate for the commodity produced. The underlying theoretical problem is the optimal control of a continuous-time system with jump Markov disturbances, with an infinite-horizon discounted cost criterion. We use two complementary approaches. First, proceeding informally, and using a combination of stochastic coupling, linear system arguments, stable and unstable eigenspaces, renewal theory, parametric optimization, etc., we arrive at a conjecture for the optimal policy. Then we address the previously ignored mathematical difficulties associated with differential equations with discontinuous right-hand sides, singularity of the optimal control problem, smoothness, and validity of the dynamic programming equation, to give a rigorous proof of optimality of the conjectured policy. It is hoped that both approaches will find uses in other such problems as well. We obtain the complete solution and show that the optimal solution is simply characterized by a certain critical number, which we call the optimal inventory level. If the current inventory level exceeds the optimal level, one should not produce at all; if less, one should produce at the maximum rate; and if exactly equal, one should produce exactly enough to meet demand. We also give a simple explicit formula for the optimal inventory level.
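The policy described above has a simple algorithmic form. A minimal sketch follows (not taken from the paper; the names z_star, r_max and d for the optimal inventory level, maximum production rate and demand rate are ours):

```python
# Minimal sketch of the critical-number ("hedging point") policy described above.
# The names z_star (optimal inventory level), r_max (maximum production rate) and
# d (demand rate) are illustrative, not the paper's notation.
def production_rate(inventory: float, z_star: float, r_max: float, d: float) -> float:
    if inventory > z_star:   # above the optimal inventory level: do not produce
        return 0.0
    if inventory < z_star:   # below it: produce at the maximum rate
        return r_max
    return d                 # exactly at the optimal level: produce to meet demand
```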

643 citations


Journal ArticleDOI
TL;DR: In this paper, a Lagrange multiplier formulation involving a dynamic programming equation is utilized to relate the constrained optimization to an unconstrained optimization parametrized by the multiplier, leading to a proof for the existence of a semi-simple optimal constrained policy.
Abstract: Optimal causal policies maximizing the time-average reward over a semi-Markov decision process (SMDP), subject to a hard constraint on a time-average cost, are considered. Rewards and costs depend on the state and action, and contain running as well as switching components. It is supposed that the state space of the SMDP is finite, and the action space compact metric. The policy determines an action at each transition point of the SMDP. Under an accessibility hypothesis, several notions of time average are equivalent. A Lagrange multiplier formulation involving a dynamic programming equation is utilized to relate the constrained optimization to an unconstrained optimization parametrized by the multiplier. This approach leads to a proof for the existence of a semi-simple optimal constrained policy. That is, there is at most one state for which the action is randomized between two possibilities; at all other states, an action is uniquely chosen. Affine forms for the rewards, costs and transition probabilities further reduce the optimal constrained policy to 'almost bang-bang' form, in which the optimal policy is not randomized, and is bang-bang except perhaps at one state. Under the same assumptions, one can alternatively find an optimal constrained policy that is strictly bang-bang, but may be randomized at one state. Application is made to flow control of a birth-and-death process (e.g., an M/M/s queue); under certain monotonicity restrictions on the reward and cost structure the preceding results apply, and in addition there is a simple acceptance region.
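Schematically, the multiplier idea works as follows (notation ours, not the paper's): for a multiplier γ ≥ 0 one solves an unconstrained time-average problem with the cost folded into the reward, and then tunes γ so that the cost constraint is met.

```latex
% Schematic Lagrangian relaxation of the constrained time-average problem
% (notation ours): R(\pi) and C(\pi) denote the time-average reward and cost
% under policy \pi, and c_0 the hard bound on the cost.
\[
  \max_{\pi}\ R(\pi) \quad \text{s.t.}\ C(\pi) \le c_0
  \qquad \leadsto \qquad
  \max_{\pi}\ \bigl[\, R(\pi) \;-\; \gamma\, C(\pi) \,\bigr], \quad \gamma \ge 0 .
\]
```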

84 citations


Journal ArticleDOI
TL;DR: In this article, a general dynamic programming algorithm for the solution of optimal stochastic control problems concerning a class of discrete event systems is presented, where the emphasis is put on the numerical technique used for the approximation of the dynamic programming equation.
Abstract: This paper presents a general dynamic programming algorithm for the solution of optimal stochastic control problems concerning a class of discrete event systems. The emphasis is put on the numerical technique used for the approximation of the solution of the dynamic programming equation. This approach can be efficiently used for the solution of optimal control problems concerning Markov renewal processes. This is illustrated on a group preventive replacement model generalizing an earlier work of the authors.
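The paper's scheme is tailored to Markov renewal processes; as a rough illustration of the successive-approximation idea behind such numerical techniques, here is a minimal discounted value-iteration sketch on a finite state and action space (our notation, not the authors' algorithm):

```python
import numpy as np

# Rough illustration of successive approximation of a discounted dynamic
# programming equation on a finite state/action space (not the paper's scheme).
# P[a] is the transition matrix under action a, r[a] the reward vector, 0 < beta < 1.
def value_iteration(P, r, beta, tol=1e-8, max_iter=10_000):
    V = np.zeros(P[0].shape[0])
    for _ in range(max_iter):
        Q = np.stack([r[a] + beta * P[a] @ V for a in range(len(P))])
        V_new = Q.max(axis=0)
        done = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if done:
            break
    return V, Q.argmax(axis=0)   # approximate value function and greedy policy
```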

64 citations


Journal ArticleDOI
TL;DR: It is argued that a failure to recognize the special features of the model in the context of which the principle was stated has resulted in the latter being misconstrued in the dynamic programming literature.
Abstract: New light is shed on Bellman's principle of optimality and the role it plays in Bellman's conception of dynamic programming. It is argued that a failure to recognize the special features of the model in the context of which the principle was stated has resulted in the latter being misconstrued in the dynamic programming literature.

56 citations


Journal ArticleDOI
TL;DR: In this article, multi-grid algorithms are developed for the numerical solution of Hamilton-Jacobi-Bellman equations, combining standard multi-grid techniques with the iterative methods used by Lions and Mercier in [11].
Abstract: In this paper we develop multi-grid algorithms for the numerical solution of Hamilton-Jacobi-Bellman equations. The proposed schemes result from a combination of standard multi-grid techniques and the iterative methods used by Lions and Mercier in [11]. A convergence result is given and the efficiency of the algorithms is illustrated by some numerical examples.
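For orientation, the discrete problems that such schemes address can be written in the following generic form (notation ours): a pointwise maximum of finitely many linear operators, to which the multi-grid smoothing and coarse-grid corrections are applied.

```latex
% Generic discrete Hamilton-Jacobi-Bellman problem (notation ours): find the grid
% function u_h satisfying, at every point of the grid \Omega_h,
\[
  \max_{1 \le i \le m} \bigl( A_i\, u_h \;-\; f_i \bigr) \;=\; 0 ,
\]
% where each A_i is a linear (finite-difference) operator and f_i a given grid function.
```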

51 citations


Journal ArticleDOI
TL;DR: In this article, the authors consider a general optimal control problem in which the constraints depend on a parameter, and study the resulting value function; a formula for the generalized gradient of V is proven and then used to obtain results on stability and controllability of the problem.
Abstract: We consider a general optimal control problem in which the constraints depend on a parameter $\alpha $, and the resulting value function $V(\alpha )$. A formula for the generalized gradient of V is proven and then used to obtain results on stability and controllability of the problem. A special study is made of the time-optimal control problem, one consequence of which is a new criterion assuring local null-controllability of the system and continuity of the minimal time function at the origin.

41 citations


Journal ArticleDOI
TL;DR: The dual control law for an integrator with constant but unknown gain is computed in this paper, and a representation which makes it easy to compare dual control with certainty equivalence and cautious control is also introduced.
Abstract: The dual control law for an integrator with constant but unknown gain is computed. Numerical problems associated with the solution of the Bellman equation are reviewed. Properties of the dual control law are discussed. A representation which makes it easy to compare dual control with certainty equivalence and cautious control is also introduced.
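A generic model of the kind referred to above (symbols ours; not necessarily the paper's exact formulation) is a first-order system whose constant gain must be learned while the system is being controlled:

```latex
% Generic discrete-time integrator with constant but unknown gain b (symbols ours):
\[
  x_{t+1} \;=\; x_t \;+\; b\, u_t \;+\; e_t ,
  \qquad b \ \text{constant but unknown}, \quad e_t \sim \mathcal{N}(0,\sigma^2) ,
\]
% so the controller must trade off regulating x_t against probing to estimate b,
% which is the source of the dual-control effect.
```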

40 citations


Journal ArticleDOI
TL;DR: Convergence theorems that, when applied to the case of bounded rewards, give stronger results than those in [9] are proved and bounds on the rates of convergence under several assumptions are given.
Abstract: A finite-state iterative scheme introduced by White [9] to approximate the optimal value function of denumerable-state Markov decision processes with bounded rewards, is extended to the case of unbounded rewards. Convergence theorems that, when applied to the case of bounded rewards, give stronger results than those in [9] are proved. Moreover, bounds on the rates of convergence under several assumptions are given and the extended scheme is used to obtain policies with asymptotic optimality properties.

37 citations


Journal ArticleDOI
TL;DR: When the deterministic bioeconomic parameters are taken from aggregate Antarctic pelagic whaling data, the optimal results are most sensitive to the rate of stochastic jumps and, to a lesser extent, to the quadratic cost factor.
Abstract: Dynamic programming is employed to examine the effects of large, sudden changes in population size on the optimal harvest strategy of an exploited resource population. These changes are either adverse or favorable and are assumed to occur at the times of events of a Poisson process. The amplitude of these jumps is assumed to be density independent. In between jumps the population is assumed to grow logistically. The Bellman equation for the optimal discounted present value is solved numerically and the optimal feedback control computed for the random jump model. The results are compared to the corresponding results for the quasi-deterministic approximation. In addition, the sensitivity of the results to the discount rate, the total jump rate and the quadratic cost factor is investigated. The optimal results are most sensitive to the rate of stochastic jumps and, to a lesser extent, to the quadratic cost factor when the deterministic bioeconomic parameters are taken from aggregate Antarctic pelagic whaling data.
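Schematically, the Bellman equation solved numerically in such a setting has the following form (symbols ours, not the paper's): logistic growth of the stock x between Poisson jumps of rate λ and amplitude J, a harvest rate h, a running profit π(x, h), and a discount rate δ.

```latex
% Schematic Bellman (HJB) equation for the jump model described above (symbols ours):
\[
  \delta\, V(x) \;=\; \max_{h \ge 0} \Bigl\{ \pi(x,h)
    \;+\; \bigl( r\, x\, (1 - x/K) - h \bigr)\, V'(x)
    \;+\; \lambda \, \mathbb{E}\bigl[\, V(x + J) - V(x) \,\bigr] \Bigr\} .
\]
```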

32 citations


Journal ArticleDOI
TL;DR: In this article, a stochastic control problem similar to the one dimensional linear-quadratic-Gaussian problem but with an asymptotically linear cost for control is studied.
Abstract: A stochastic control problem similar to the one dimensional linear-quadratic-Gaussian problem but with an asymptotically linear cost for control is studied. The value function is characterized, and it is shown that the optimal control process has both absolutely continuous and singular components. A discussion of the fact that the value function is C^2 is given, and an example of a singular control problem in which the value function is not C^2 is presented.

31 citations


Journal ArticleDOI
TL;DR: In this article, a family of problems obtained by perturbing infinite-dimensionally the dynamics of an optimal control problem is examined, and a formula is derived for the generalized gradient of the associated value function, one which specializes to yield, for instance, information about ordinary directional derivatives.
Abstract: We examine a family of problems obtained by perturbing infinite-dimensionally the dynamics of an optimal control problem. A formula is derived for the generalized gradient of the associated value function, one which specializes to yield, for instance, information about ordinary directional derivatives. Several examples are discussed.

Journal ArticleDOI
TL;DR: It turns out that strong stability in the sense of Kojima in the first phase is a natural assumption for the iterated local minima of the parametric problem and a generalized version of a positive definiteness criterion of Fujiwara-Han-Mangasarian is used.
Abstract: In dynamic programming and decomposition methods one often applies an iterated minimization procedure. The problem variables are partitioned into several blocks, say x and y. Treating y as a parameter, the first phase consists of minimization with respect to the variable x. In a second phase the minimization of the resulting optimal value function depending on y is considered. In this paper we treat this basic idea on a local level. It turns out that strong stability in the sense of Kojima in the first phase is a natural assumption. In order to show that the iterated local minima of the parametric problem lead to a local minimum for the whole problem, we use a generalized version of a positive definiteness criterion of Fujiwara-Han-Mangasarian.
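The basic identity behind the two-phase procedure can be written as follows (generic notation, ours). The paper studies the local version of this identity, where both minima are only local and stability of the inner minimizer with respect to y becomes essential.

```latex
% Iterated minimization: first minimize over x with y fixed as a parameter, then
% minimize the resulting optimal value function over y (generic notation).
\[
  \min_{x,\,y}\ f(x,y) \;=\; \min_{y}\ \varphi(y),
  \qquad \varphi(y) \;:=\; \min_{x}\ f(x,y) .
\]
```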

Journal ArticleDOI
TL;DR: In this paper, a test criterion for bankruptcy is developed, and a portfolio optimization problem is investigated and solved using the Doléans-Dade exponential formula; the optimality criterion is to maximize the expected rate of growth.
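For reference, the Doléans-Dade (stochastic) exponential invoked above is the standard object defined as follows; this is the textbook definition, not a formula specific to the paper.

```latex
% Doléans-Dade exponential of a semimartingale X: the unique solution of
% Z_t = 1 + \int_0^t Z_{s-}\, dX_s, given explicitly by
\[
  \mathcal{E}(X)_t \;=\; \exp\!\Bigl( X_t - X_0 - \tfrac12 \langle X^{c} \rangle_t \Bigr)
  \prod_{0 < s \le t} \bigl( 1 + \Delta X_s \bigr)\, e^{-\Delta X_s} .
\]
```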

Book ChapterDOI
TL;DR: In this article, the authors synthesize some results about well-posedness and stability analysis in abstract minimum problems and optimal control of ordinary differential inclusions, based on the variational convergence.
Abstract: Publisher Summary This chapter synthesizes some results about well-posedness and stability analysis in abstract minimum problems and optimal control of ordinary differential inclusions. The chapter describes well-posedness of convex minimum problems in Banach spaces by using their optimal value functions. The chapter also deals with optimal control problems for differential inclusions. Neither convexity nor existence of optimal solutions is assumed. The chapter further relates the continuous behavior of the optimal value function with respect to perturbations acting on the data to its stable behavior when passing to the (unperturbed) relaxed problem. The approach behind these theorems is based on variational convergence. The relations between variational convergence (including epi-convergence) and some results reported in the chapter are considered with an eye on mathematical programming problems.

Journal ArticleDOI
TL;DR: This paper demonstrates how a Markov decision process (MDP) can be approximated to generate a policy bound, i.e., a function that bounds the optimal policy from below or from above for all states.
Abstract: This paper demonstrates how a Markov decision process (MDP) can be approximated to generate a policy bound, i.e., a function that bounds the optimal policy from below or from above for all states. We present sufficient conditions for several computationally attractive approximations to generate rigorous policy bounds. These approximations include approximating the optimal value function, replacing the original MDP with a separable approximate MDP, and approximating a stochastic MDP with its deterministic counterpart. An example from the field of fisheries management demonstrates the practical applicability of the results.

Journal ArticleDOI
TL;DR: The influence of Richard Bellman is seen in algorithms throughout the computer science literature; this article focuses in particular on his influence on the area of computer science known as algorithm design and analysis.


Proceedings ArticleDOI
01 Dec 1986
TL;DR: In this paper, an optimal control problem on a given interval [0, T] is considered whose trajectories must satisfy the state constraint g(t, x(t)) ≤ 0 a.e.
Abstract: We consider an optimal control problem on a given interval [0, T] whose trajectories must satisfy the state constraint g(t, x(t)) ≤ 0 a.e. Infinite-dimensional perturbations of this constraint give rise to a value function V, whose epigraph is a closed set containing sensitivity information, controllability and penalization results, and even necessary conditions for optimality.

Journal ArticleDOI
TL;DR: This algorithm, together with the existing numerical methods for parabolic or elliptic PDEs, provides numerical schemes for the solution of Bellman equations.



Journal Article
TL;DR: A controlled diffusion on the real line is considered: the evolution of the system is described by a stochastic process X that is a diffusion but depends on a control process U, whose states, the control parameters, range over a given set.
Abstract: Let us consider a system whose evolution is described by a stochastic process X = {X_t, t ≥ 0} with state space equal to the real line ℝ. We assume that the process X is a diffusion but depends on a control process U = {U_t, t ≥ 0}. The states of the control process, the control parameters, range over a set 𝒰 ⊂ ℝ^n. To keep the presentation concise and simple we limit ourselves to the family of X given by the following stochastic differential equation.
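The specific equation is not reproduced in this excerpt; a generic controlled diffusion of the type described (our notation) looks like:

```latex
% Generic one-dimensional controlled diffusion (our notation; the paper's specific
% equation is not reproduced in this excerpt):
\[
  dX_t \;=\; b\bigl(X_t, U_t\bigr)\, dt \;+\; \sigma\bigl(X_t, U_t\bigr)\, dW_t ,
  \qquad X_0 = x \in \mathbb{R} ,
\]
% where W is a standard Wiener process and U_t takes values in the control set.
```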

Journal ArticleDOI
TL;DR: In this article, the existence of a solution to the optimality equation for discounted finite Markov decision processes is established by means of Birkhoff's fixed point theorem, and the proof yields the well-known linear programming formulation for the optimal value function.
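The linear programming formulation referred to here is the standard one for discounted finite MDPs (notation ours): the optimal value function is the componentwise-smallest vector satisfying the Bellman inequalities.

```latex
% Standard LP formulation for a discounted finite MDP (notation ours): rewards
% r(s,a), transition probabilities p(s'|s,a), discount factor 0 < \beta < 1.
\[
  \min_{V}\ \sum_{s} V(s)
  \quad \text{s.t.} \quad
  V(s) \;\ge\; r(s,a) \;+\; \beta \sum_{s'} p(s' \mid s, a)\, V(s')
  \qquad \text{for all } s, a .
\]
```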

Journal ArticleDOI
TL;DR: In this paper, results of Fleming (1964) are extended and, as an application, the existence of generalized solutions is shown for nonlinear degenerate parabolic differential equations.
Abstract: We extend results of Fleming (1964) and, as an application, we show the existence of generalized solutions for nonlinear degenerate parabolic differential equations.

Journal ArticleDOI
TL;DR: In this article, the optimal strategies in N-person nonzero-sum stochastic differential games are characterized as solutions of certain partial initial value problems analogous to the Bellman equation in the theory of dynamic programming; linear-quadratic games with and without a control-dependent noise are studied.
Abstract: The paper deals with N-person nonzero-sum games in which the dynamics is described by Itô stochastic differential equations. Sufficient conditions are found guaranteeing the Nash equilibrium for the strategies of the players. The optimal strategies are solutions of certain partial initial value problems analogous to the Bellman equation in the theory of dynamic programming. Linear-quadratic games with and without a control-dependent noise are studied.
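Schematically, the equations characterizing the Nash equilibrium take the following coupled form (notation ours, not the paper's): one Bellman-type partial differential equation per player, with the remaining players' strategies frozen at the equilibrium candidates.

```latex
% Schematic coupled Bellman-type system for an N-person nonzero-sum game
% (notation ours): \mathcal{L}^{u} is the generator of the controlled diffusion,
% L_i the running payoff of player i, g_i the terminal payoff.
\[
  \frac{\partial V_i}{\partial t}
  + \max_{u_i}\Bigl\{ \mathcal{L}^{\,(u_i,\,u^*_{-i})} V_i(t,x)
    + L_i\bigl(t,x,u_i,u^*_{-i}\bigr) \Bigr\} = 0,
  \qquad V_i(T,x) = g_i(x), \quad i = 1,\dots,N .
\]
```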

Journal ArticleDOI
TL;DR: Sufficient optimality conditions of dynamic programming type avoiding the axioms of Boltyanskii's regular synthesis for control problems that have weakly stratified Hamiltonians are proved, as discussed by the authors.
Abstract: Sufficient optimality conditions of dynamic programming type avoiding the axioms of Boltyanskii's "regular synthesis" for control problems that have weakly stratified Hamiltonians are proved. An improved version of Boltyanskii's "fundamental lemma" is applied to the "value function" defined as the minimum of the cost functional along the solutions of a Hamiltonian inclusion which plays the role of a "system of characteristics" for the Hamilton–Jacobi–Bellman equation of dynamic programming.

Journal ArticleDOI
TL;DR: In this paper, sufficient conditions for optimal control in a linear-autonomous optimal-time problem with Lipschitz-continuous cost functional were studied, and the conditions involved a generalized Hamilton-Jacobi-Bellman equation.

Book ChapterDOI
01 Jan 1986
TL;DR: The modern mathematical economics literature is permeated with dynamics as discussed by the authors, and arguments based upon dynamics are advanced to justify various forms of equilibria; here we find issues such as the accessibility of Pareto points or the comparison of different bargaining solution concepts.
Abstract: The modern mathematical economics literature is permeated with dynamics. This starts with a simple tâtonnement story of how prices adjust according to supply and demand, and it continues with the more sophisticated price adjustment models which involve speculation, etc. Dynamics arise from the Euler, or the Bellman, equations that define the optimal paths in growth models, as well as in other optimization problems. Arguments based upon dynamics are advanced to justify various forms of equilibria; here we find issues such as the accessibility of Pareto points or the comparison of different bargaining solution concepts. In recent years, as manifested by several of the papers presented at this conference, dynamics has been used to explain non-stationary behavior such as business cycles.

Proceedings ArticleDOI
01 Dec 1986
TL;DR: In this article, the robustness of nonlinear discrete-time systems is analyzed based on the existence of a stationary solution of the dynamic programming equation (DPE), which provides directly a Lyapunov function associated to the closed-loop system.
Abstract: In this paper the robustness of nonlinear discrete-time systems is analyzed. The nominal plant is supposed to be controlled by means of a feedback control law which is optimal with respect to some given criterion. The robustness of the closed-loop system is studied for two different classes of perturbations in the control law, which are called gain and additive nonlinear perturbations. The results are entirely based on the existence of a stationary solution of the dynamic programming equation (DPE), which provides directly a Lyapunov function associated to the closed-loop system. The convexity of that solution and the use of the Taylor formula appear to be the key to establish the robustness properties of the nominal plant. Two examples are solved in order to show an interesting fact: the existence of a compromise between the robustness of the system subjected to the two different classes of perturbations.
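In generic notation (ours), the stationary dynamic programming equation on which the analysis rests, and whose solution serves as a Lyapunov function for the nominal closed loop, is:

```latex
% Generic stationary dynamic programming equation (notation ours): stage cost
% \ell, dynamics x_{k+1} = f(x_k, u_k); the solution V acts as a Lyapunov
% function for the nominal closed-loop system x_{k+1} = f(x_k, u^*(x_k)).
\[
  V(x) \;=\; \min_{u} \bigl\{ \ell(x,u) + V\bigl(f(x,u)\bigr) \bigr\},
  \qquad
  u^*(x) \;\in\; \arg\min_{u} \bigl\{ \ell(x,u) + V\bigl(f(x,u)\bigr) \bigr\} .
\]
```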