Topic
Bellman equation
About: Bellman equation is a research topic. Over its lifetime, 5884 publications have been published within this topic, receiving 135589 citations.
Papers published on a yearly basis
Papers
TL;DR: This article studies the accuracy of two versions of Kydland and Prescott's (1980, 1982) procedure for approximating optimal decision rules in problems where the objective fails to be quadratic and the constraints fail to be linear.
Abstract: This article studies the accuracy of two versions of Kydland and Prescott's (1980, 1982) procedure for approximating optimal decision rules in problems in which the objective fails to be quadratic and the constraints fail to be linear. The analysis is carried out using a version of the Brock–Mirman (1972) model of optimal economic growth. Although the model is not linear quadratic, its solution can, nevertheless, be computed with arbitrary accuracy using a variant of existing value-function iteration procedures. I find that the Kydland–Prescott approximate decision rules are very similar to those implied by value-function iteration.
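The value-function iteration the abstract refers to can be sketched for a Brock–Mirman-style model with log utility, Cobb–Douglas production, and full depreciation, a standard special case whose exact policy is known in closed form (k' = αβk^α). The parameter values and grid below are illustrative assumptions, not taken from the article:

```python
import numpy as np

# Value-function iteration for a Brock-Mirman growth model
# (log utility, Cobb-Douglas production, full depreciation).
# alpha, beta and the capital grid are illustrative choices.
alpha, beta = 0.36, 0.95
grid = np.linspace(0.05, 0.5, 200)      # capital grid
V = np.zeros(len(grid))                 # initial value guess

# consumption c = k^alpha - k' for every (k, k') pair; infeasible -> -inf
c = grid[:, None] ** alpha - grid[None, :]
util = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf)

for _ in range(1000):                   # Bellman operator iteration
    V_new = np.max(util + beta * V[None, :], axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

policy = grid[np.argmax(util + beta * V[None, :], axis=1)]
# closed-form policy for this special case: k' = alpha*beta*k^alpha
print(np.max(np.abs(policy - alpha * beta * grid ** alpha)))
```

The maximum deviation from the closed-form policy is on the order of the grid spacing, which is the sense in which value-function iteration delivers arbitrary accuracy: refine the grid and the error shrinks.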
75 citations
TL;DR: In this paper, an optimal reinsurance-investment problem is considered for an insurer whose surplus process follows a jump-diffusion model: the insurer transfers part of the risk due to insurance claims via proportional reinsurance and invests the surplus in a simplified financial market consisting of a risk-free asset and a risky asset.
Abstract: We consider an optimal reinsurance-investment problem of an insurer whose surplus process follows a jump-diffusion model. In our model the insurer transfers part of the risk due to insurance claims via a proportional reinsurance and invests the surplus in a “simplified” financial market consisting of a risk-free asset and a risky asset. The dynamics of the risky asset are governed by a constant elasticity of variance model to incorporate conditional heteroscedasticity. The objective of the insurer is to choose an optimal reinsurance-investment strategy so as to maximize the expected exponential utility of terminal wealth. We investigate the problem using the Hamilton-Jacobi-Bellman dynamic programming approach. Explicit forms for the optimal reinsurance-investment strategy and the corresponding value function are obtained. Numerical examples are provided to illustrate how the optimal reinsurance-investment policy changes when the model parameters vary.
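In dynamic-programming terms, the approach the abstract describes can be sketched as follows; the notation here is generic (not the paper's), with q the retained proportion of claims, π the amount invested in the risky asset, and X^{q,π} the controlled surplus:

```latex
% Value function under exponential utility (risk-aversion parameter \eta):
\[
  V(t,x) \;=\; \sup_{(q,\pi)} \mathbb{E}\!\left[\, -\tfrac{1}{\eta}\,
    e^{-\eta X_T^{q,\pi}} \;\middle|\; X_t^{q,\pi} = x \right].
\]
% The associated Hamilton-Jacobi-Bellman equation, with terminal condition:
\[
  \sup_{(q,\pi)} \bigl\{ \partial_t V + \mathcal{A}^{q,\pi} V \bigr\} = 0,
  \qquad V(T,x) = -\tfrac{1}{\eta}\, e^{-\eta x},
\]
```

where 𝒜^{q,π} denotes the generator of the controlled jump-diffusion. Maximizing pointwise over (q, π) yields a candidate optimal strategy, which is then verified against the HJB equation; this is the general pattern behind the explicit forms the paper obtains.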
75 citations
TL;DR: An asymptotic analysis of a hierarchical manufacturing system with machines subject to breakdown and repair is presented; the control for the original problem can be constructed from the optimal controls of the limiting problem in a way that guarantees asymptotic optimality of the value function.
Abstract: This paper presents an asymptotic analysis of a hierarchical manufacturing system with machines subject to breakdown and repair. The rate of change in machine states is much larger than the rate of fluctuation in demand and the rate of discounting of costs, and this gives rise to a limiting problem in which the stochastic machine availability is replaced by the equilibrium mean availability. The value function for the original problem converges to the value function of the limiting problem. Moreover, the control for the original problem can be constructed from the optimal controls of the limiting problem in a way which guarantees asymptotic optimality of the value function. The limiting problem is computationally more tractable and sometimes has a closed form solution.
74 citations
TL;DR: A methodology to design a dynamic controller to achieve L2-disturbance attenuation or approximate optimality, with asymptotic stability is introduced.
Abstract: The solution of most nonlinear control problems hinges upon the solvability of partial differential equations or inequalities. In particular, disturbance attenuation and optimal control problems for nonlinear systems are generally solved exploiting the solution of the so-called Hamilton-Jacobi (HJ) inequality and the Hamilton-Jacobi-Bellman (HJB) equation, respectively. An explicit closed-form solution of this inequality, or equation, may however be hard or impossible to find in practical situations. Herein we introduce a methodology to circumvent this issue for input-affine nonlinear systems proposing a dynamic, i.e., time-varying, approximate solution of the HJ inequality and of the HJB equation the construction of which does not require solving any partial differential equation or inequality. This is achieved considering the immersion of the underlying nonlinear system into an augmented system defined on an extended state-space in which a (locally) positive definite storage function, or value function, can be explicitly constructed. The result is a methodology to design a dynamic controller to achieve L2-disturbance attenuation or approximate optimality, with asymptotic stability.
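For context, the HJB equation the abstract refers to takes the following form for an input-affine system ẋ = f(x) + g(x)u with running cost q(x) + uᵀRu (a standard formulation, not notation taken from the paper):

```latex
\[
  0 \;=\; \min_{u}\left\{ q(x) + u^{\top} R u
        + V_x(x)\bigl(f(x) + g(x)u\bigr) \right\},
\]
% whose pointwise minimizer is
\[
  u^{*}(x) \;=\; -\tfrac{1}{2}\, R^{-1} g(x)^{\top} V_x(x)^{\top},
\]
% reducing the HJB equation to the partial differential equation
\[
  0 \;=\; q(x) + V_x(x) f(x)
        - \tfrac{1}{4}\, V_x(x)\, g(x)\, R^{-1} g(x)^{\top} V_x(x)^{\top}.
\]
```

Solving this PDE for the value function V in closed form is what is generally intractable outside the linear-quadratic case, and that obstruction is what the paper's dynamic, extended-state-space construction is designed to circumvent.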
74 citations
TL;DR: This survey reviews state-of-the-art methods for (parametric) value function approximation by grouping them into three main categories: bootstrapping, residual, and projected fixed-point approaches.
Abstract: Reinforcement learning (RL) is a machine learning answer to the optimal control problem. It consists of learning an optimal control policy through interactions with the system to be controlled, the quality of this policy being quantified by the so-called value function. A recurrent subtopic of RL concerns computing an approximation of this value function when the system is too large for an exact representation. This survey reviews state-of-the-art methods for (parametric) value function approximation by grouping them into three main categories: bootstrapping, residual, and projected fixed-point approaches. Related algorithms are derived by considering one of the associated cost functions and a specific minimization method, generally a stochastic gradient descent or a recursive least-squares approach.
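A minimal instance of the bootstrapping family the survey describes is semi-gradient TD(0) with a linear parametric value function, trained by stochastic gradient steps toward a bootstrapped target. The chain MDP, feature map, step size, and episode count below are illustrative assumptions, not taken from the survey:

```python
import numpy as np

# Semi-gradient TD(0) with a linear value function V(s) = phi(s) @ w,
# evaluated on a small random-walk chain: from state s move left or right
# at random; reward 1 on reaching the right end, 0 elsewhere.
rng = np.random.default_rng(0)
n_states, gamma, step = 5, 0.9, 0.1
phi = np.eye(n_states)            # one-hot features (tabular special case)
w = np.zeros(n_states)            # parameter vector being learned

for _ in range(5000):             # episodes, each starting mid-chain
    s = n_states // 2
    while True:
        s_next = min(max(s + rng.choice([-1, 1]), 0), n_states - 1)
        r = 1.0 if s_next == n_states - 1 else 0.0
        done = s_next in (0, n_states - 1)
        # bootstrapped target: reward plus discounted current estimate
        target = r + (0.0 if done else gamma * phi[s_next] @ w)
        # stochastic-gradient step on the TD error, holding the target fixed
        w += step * (target - phi[s] @ w) * phi[s]
        if done:
            break
        s = s_next

print(np.round(phi @ w, 2))       # learned state values
```

The update treats the bootstrapped target as a constant when differentiating, which is exactly what distinguishes this family from the residual approaches the survey contrasts it with; swapping the one-hot `phi` for a smaller feature matrix gives genuine function approximation with the same code.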
74 citations