Topic

Bellman equation

About: Bellman equation is a research topic. Over its lifetime, 5884 publications have been published on this topic, receiving 135589 citations.


Papers
Journal ArticleDOI
TL;DR: In this paper, the admissibility properties of the iterative control laws are developed for value iteration algorithms for the first time, and new termination criteria are established to guarantee the effectiveness of the iterative control laws.
Abstract: In this paper, a value iteration adaptive dynamic programming (ADP) algorithm is developed to solve infinite-horizon undiscounted optimal control problems for discrete-time nonlinear systems. The present value iteration ADP algorithm permits an arbitrary positive semi-definite function to initialize the algorithm. A novel convergence analysis guarantees that the iterative value function converges to the optimal performance index function. It is proven that, depending on the initial function, the iterative value function is monotonically nonincreasing, monotonically nondecreasing, or nonmonotonic, and in each case converges to the optimum. For the first time, the admissibility properties of the iterative control laws are developed for value iteration algorithms, and new termination criteria are established to guarantee the effectiveness of the iterative control laws. Neural networks are used to approximate the iterative value function and to compute the iterative control law, facilitating the implementation of the iterative ADP algorithm. Finally, two simulation examples illustrate the performance of the present method.

324 citations
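To make the value iteration scheme concrete, here is a minimal sketch in Python. It is not the paper's implementation: the paper uses neural networks to represent the value function and control law, whereas this sketch discretizes an assumed scalar system on grids with linear interpolation; the dynamics, stage cost, and all constants are illustrative assumptions.

```python
# Hedged sketch of undiscounted value iteration ADP on an assumed scalar
# system x_{k+1} = 0.9*sin(x_k) + u_k with stage cost x^2 + u^2. Grids plus
# linear interpolation stand in for the paper's neural networks.
import numpy as np

def f(x, u):
    return 0.9 * np.sin(x) + u          # assumed nonlinear dynamics

def cost(x, u):
    return x**2 + u**2                  # positive-definite stage cost

xs = np.linspace(-2.0, 2.0, 201)        # state grid (stands in for the critic)
us = np.linspace(-1.0, 1.0, 41)         # control grid (stands in for the actor)
X, Uc = np.meshgrid(xs, us, indexing="ij")

V = np.zeros_like(xs)                   # arbitrary positive semi-definite V_0

for sweep in range(500):
    # Bellman backup: V_{i+1}(x) = min_u [ cost(x, u) + V_i(f(x, u)) ]
    q = cost(X, Uc) + np.interp(f(X, Uc), xs, V)
    V_new = q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-9:    # termination criterion
        break
    V = V_new

u_star = us[q.argmin(axis=1)]           # greedy iterative control law
print(f"converged after {sweep} sweeps; V(0) = {np.interp(0.0, xs, V):.6f}")
```

Since V_0 = 0 underestimates the optimal cost here, the iterates are monotonically nondecreasing, matching one of the three monotonicity cases the abstract describes.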

Journal ArticleDOI
TL;DR: In this article, the authors consider a classical risk model and allow investment in a risky asset following a Black–Scholes model, as well as proportional reinsurance.
Abstract: We consider a classical risk model and allow investment in a risky asset following a Black–Scholes model, as well as (proportional) reinsurance. Via the Hamilton–Jacobi–Bellman approach we find a candidate for the optimal strategy and develop a numerical procedure to solve the HJB equation. We prove a verification theorem to show that any increasing solution to the HJB equation is bounded and solves the optimisation problem, and we prove that an increasing solution to the HJB equation exists. Finally, two numerical examples are discussed.

321 citations
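For orientation, problems of this type typically lead to an HJB equation of the following general form. The notation below is a hedged sketch assembled from the standard literature on this model class, not the paper's exact equation: surplus x, proportional retention level b, amount A invested in the risky asset, net premium rate c(b), claim intensity λ, generic claim size Y, and Black–Scholes drift μ and volatility σ.

\[
\sup_{b \in [0,1],\; A \ge 0} \left\{ \tfrac{1}{2}\sigma^2 A^2 V''(x) + \big(c(b) + \mu A\big) V'(x) + \lambda \Big( \mathbb{E}\big[V(x - bY)\big] - V(x) \Big) \right\} = 0 .
\]

Formally, the first-order condition in A gives A*(x) = -μ V'(x) / (σ² V''(x)), which is one reason the verification argument needs the solution V to be increasing with the right curvature.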

Journal ArticleDOI
01 Mar 1990
TL;DR: By representing value-function separability in the structure of the influence diagram's graph, formulation is simplified and operations on the model can take advantage of the separability, allowing the separability of a decision problem's value function to be exploited simply.
Abstract: The concept of a super value node is developed to extend the theory of influence diagrams to allow dynamic programming to be performed within this graphical modeling framework. The operations necessary to exploit the presence of these nodes and efficiently analyze the models are developed. The key result is that by representing value function separability in the structure of the graph of the influence diagram, formulation is simplified and operations on the model can take advantage of the separability. From the decision analysis perspective, this allows simple exploitation of separability in the value function of a decision problem. This allows algorithms to be designed to solve influence diagrams that automatically recognize the opportunity for applying dynamic programming. From the decision processes perspective, influence diagrams with super value nodes allow efficient formulation and solution of nonstandard decision process structures. They also allow the exploitation of conditional independence between state variables.

320 citations
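A minimal sketch of the separability idea (toy numbers, not the paper's influence-diagram algorithms): when the value attached to a super value node is a sum of sub-values, the inner maximization can be rolled back first, which is exactly the dynamic programming step the graph structure makes automatic.

```python
# Toy illustration of value separability: total value v1(d1) + v2(d1, d2).
# Joint enumeration and the rolled-back (dynamic programming) computation
# give the same optimum; the DP form only ever maximizes one decision at a
# time. All numbers are made up for illustration.
import itertools

D1, D2 = [0, 1], [0, 1, 2]
v1 = {d1: -abs(d1 - 0.3) for d1 in D1}
v2 = {(d1, d2): -(d2 - 2 * d1) ** 2 for d1 in D1 for d2 in D2}

# Joint enumeration over all (d1, d2) pairs.
best_joint = max(v1[d1] + v2[d1, d2]
                 for d1, d2 in itertools.product(D1, D2))

# Dynamic programming: roll back the inner max over d2, then max over d1.
inner = {d1: max(v2[d1, d2] for d2 in D2) for d1 in D1}
best_dp = max(v1[d1] + inner[d1] for d1 in D1)

assert best_joint == best_dp
print("optimal value:", best_dp)
```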

Journal ArticleDOI
TL;DR: An online learning algorithm is developed to solve the linear quadratic tracking (LQT) problem for partially unknown continuous-time systems, and it is shown that the value function is quadratic in terms of the state of the system and the command generator.
Abstract: In this technical note, an online learning algorithm is developed to solve the linear quadratic tracking (LQT) problem for partially-unknown continuous-time systems. It is shown that the value function is quadratic in terms of the state of the system and the command generator. Based on this quadratic form, an LQT Bellman equation and an LQT algebraic Riccati equation (ARE) are derived to solve the LQT problem. The integral reinforcement learning technique is used to find the solution to the LQT ARE online and without requiring the knowledge of the system drift dynamics or the command generator dynamics. The convergence of the proposed online algorithm to the optimal control solution is verified. To show the efficiency of the proposed approach, a simulation example is provided.

320 citations
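The quadratic value-function structure is easy to exhibit offline. The sketch below is not the paper's online integral reinforcement learning procedure, which avoids knowledge of the drift dynamics; it simply augments an assumed plant with an assumed command generator and solves the resulting algebraic Riccati equation with SciPy, showing that V(X) = XᵀPX on the augmented state X = [x; r]. All matrices and the discount rate are illustrative assumptions.

```python
# Hedged sketch: the LQT value function is quadratic on X = [x; r]. We solve
# the (discounted) LQT ARE offline with known, assumed matrices; the paper's
# contribution is solving it ONLINE without the drift dynamics.
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-1.0, -0.5]])   # assumed plant: xdot = A x + B u
B = np.array([[0.0], [1.0]])
F = np.array([[0.0, 1.0], [-1.0, 0.0]])    # assumed command generator: rdot = F r
C = np.array([[1.0, 0.0]])                 # tracked output y = C x
H = np.array([[1.0, 0.0]])                 # assumed reference output y_d = H r

Q = np.array([[10.0]])                     # tracking-error weight
R = np.array([[1.0]])                      # control weight
gamma = 0.1                                # assumed discount rate

# Augmented dynamics and cost on X = [x; r]; tracking error e = C x - H r.
T = np.block([[A, np.zeros((2, 2))], [np.zeros((2, 2)), F]])
B1 = np.vstack([B, np.zeros((2, 1))])
M = np.hstack([C, -H])                     # e = M X
Q1 = M.T @ Q @ M

# Fold the discount into a standard ARE by shifting T by -gamma/2 * I:
# T'P + PT - gamma*P - P B1 R^{-1} B1' P + Q1 = 0.
P = solve_continuous_are(T - 0.5 * gamma * np.eye(4), B1, Q1, R)
K = np.linalg.solve(R, B1.T @ P)           # tracking control u = -K X
print("value-function matrix P:\n", P.round(3))
print("gain K:", K.round(3))
```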

Journal ArticleDOI
TL;DR: In this article, the authors develop a theory for stochastic control problems which are time inconsistent in the sense that they do not admit a Bellman optimality principle, attack these problems by viewing them within a game-theoretic framework, and look for Nash subgame-perfect equilibrium points.
Abstract: We develop a theory for stochastic control problems which, in various ways, are time inconsistent in the sense that they do not admit a Bellman optimality principle. We attack these problems by viewing them within a game-theoretic framework, and we look for Nash subgame-perfect equilibrium points. For a general controlled Markov process and a fairly general objective functional, we derive an extension of the standard Hamilton-Jacobi-Bellman equation, in the form of a system of non-linear equations, for the determination of the equilibrium strategy as well as the equilibrium value function. All known examples of time inconsistency in the literature are easily seen to be special cases of the present theory. We also prove that for every time-inconsistent problem there exists an associated time-consistent problem such that the optimal control and the optimal value function for the consistent problem coincide with the equilibrium control and value function, respectively, for the time-inconsistent problem. We also study some concrete examples.

315 citations
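The canonical example of such time inconsistency, and one of the known cases the abstract alludes to, is dynamic mean–variance optimization:

\[
J(t, x, \mathbf{u}) \;=\; \mathbb{E}_{t,x}\big[X_T^{\mathbf{u}}\big] \;-\; \frac{\gamma}{2}\,\mathrm{Var}_{t,x}\big[X_T^{\mathbf{u}}\big] .
\]

Because the variance term is a nonlinear function of a conditional expectation, the tower property underlying the Bellman principle fails, and the equilibrium control characterized by the extended HJB system replaces the (nonexistent) dynamically optimal control.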


Network Information
Related Topics (5)
Optimal control: 68K papers, 1.2M citations (87% related)
Bounded function: 77.2K papers, 1.3M citations (85% related)
Markov chain: 51.9K papers, 1.3M citations (85% related)
Linear system: 59.5K papers, 1.4M citations (84% related)
Optimization problem: 96.4K papers, 2.1M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    261
2022    537
2021    369
2020    411
2019    348
2018    353