Topic

Bellman equation

About: Bellman equation is a research topic. Over its lifetime, 5884 publications have been published on this topic, receiving 135589 citations.


Papers
Journal ArticleDOI
TL;DR: A procedure for pricing American-style Asian options of the Bermudan flavor, based on dynamic programming combined with finite-element piecewise-polynomial approximation of the value function, is developed here.
Abstract: Pricing European-style Asian options based on the arithmetic average, under the Black and Scholes model, involves estimating an integral (a mathematical expectation) for which no easily computable analytical solution is available. Pricing their American-style counterparts, which provide early exercise opportunities, poses the additional difficulty of solving a dynamic optimization problem to determine the optimal exercise strategy. A procedure for pricing American-style Asian options of the Bermudan flavor, based on dynamic programming combined with finite-element piecewise-polynomial approximation of the value function, is developed here. A convergence proof is provided. Numerical experiments illustrate the consistency and efficiency of the procedure. Theoretical properties of the value function and of the optimal exercise strategy are also established.

43 citations
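For intuition, here is a minimal Python sketch of the backward dynamic-programming recursion for a fixed-strike Bermudan Asian call under Black-Scholes. It substitutes bilinear grid interpolation for the paper's finite-element piecewise-polynomial approximation of the value function, and every parameter (strike, grids, number of exercise dates, quadrature order) is an illustrative assumption rather than a value from the paper.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Illustrative (hypothetical) contract and model parameters.
S0, K, r, sigma, T, n_ex = 100.0, 100.0, 0.05, 0.2, 1.0, 12
dt = T / n_ex                      # time between exercise/monitoring dates
disc = np.exp(-r * dt)             # one-period discount factor

S_grid = np.linspace(20.0, 250.0, 61)   # spot-price grid
A_grid = np.linspace(20.0, 250.0, 61)   # running-average grid
z, w = np.polynomial.hermite_e.hermegauss(9)  # standard-normal quadrature
w = w / w.sum()

# Value at the last exercise date is the immediate payoff max(A - K, 0).
V = np.broadcast_to(np.maximum(A_grid - K, 0.0),
                    (S_grid.size, A_grid.size)).copy()

for k in range(n_ex - 1, 0, -1):        # backward induction over dates
    interp = RegularGridInterpolator((S_grid, A_grid), V,
                                     bounds_error=False, fill_value=None)
    V_new = np.empty_like(V)
    for i, S in enumerate(S_grid):
        # Lognormal one-period transitions evaluated at quadrature nodes.
        S_next = S * np.exp((r - 0.5 * sigma**2) * dt
                            + sigma * np.sqrt(dt) * z)
        for j, A in enumerate(A_grid):
            A_next = (k * A + S_next) / (k + 1)  # average gains one observation
            cont = disc * np.dot(w, interp(np.column_stack([S_next, A_next])))
            V_new[i, j] = max(A - K, 0.0, cont)  # exercise vs. continue
    V = V_new

# Time-0 price: discount the date-1 value, where the running average
# equals the first monitored observation S1.
interp = RegularGridInterpolator((S_grid, A_grid), V,
                                 bounds_error=False, fill_value=None)
S1 = S0 * np.exp((r - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z)
price = disc * np.dot(w, interp(np.column_stack([S1, S1])))
print(f"Bermudan Asian call price ~ {price:.3f}")
```

The paper's piecewise-polynomial finite elements would replace the bilinear interpolation step, which is where its convergence proof does the work.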

Posted Content
TL;DR: The correspondence between CMKV-MDP and a general lifted MDP on the space of probability measures is proved, and the dynamic programming Bellman fixed point equation satisfied by the value function is established.
Abstract: We develop an exhaustive study of Markov decision processes (MDPs) under mean-field interaction in both states and actions, in the presence of common noise, when optimization is performed over open-loop controls on an infinite horizon. Such a model, called CMKV-MDP for conditional McKean-Vlasov MDP, arises as the asymptotic problem, obtained here rigorously with a rate of convergence, of N cooperative agents controlled by a social planner/influencer who observes the environment noises but not necessarily the individual states of the agents. We highlight the crucial role of relaxed controls and of the randomization hypothesis for this class of models relative to classical MDP theory. We prove the correspondence between the CMKV-MDP and a general lifted MDP on the space of probability measures, and establish the dynamic programming Bellman fixed-point equation satisfied by the value function, as well as the existence of ε-optimal randomized feedback controls. The proof arguments involve an original measurable optimal coupling for the Wasserstein distance. This provides a procedure for learning strategies in a large population of interacting collaborative agents. MSC Classification: 90C40, 49L20.

43 citations
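Schematically, in the lifted MDP the state is the conditional law μ of a representative agent, and the value function satisfies a Bellman fixed-point equation of the following form (notation simplified here, not taken verbatim from the paper):

```latex
V(\mu) \;=\; \sup_{\alpha}\Big\{ \hat{R}(\mu,\alpha)
\;+\; \beta\, \mathbb{E}\big[\, V\big(F(\mu,\alpha,\varepsilon^{0})\big) \,\big] \Big\}
```

where \hat{R}(\mu,\alpha) aggregates the individual rewards under the law \mu, F pushes \mu forward through the controlled dynamics, \beta is the discount factor, and \varepsilon^{0} is the common noise.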

Proceedings ArticleDOI
01 Dec 2018
TL;DR: A primal-dual distributed GTD algorithm is proposed and it is proved that it almost surely converges to a set of stationary points of the optimization problem.
Abstract: The goal of this paper is to study a distributed version of the gradient temporal-difference (GTD) learning algorithm for multi-agent Markov decision processes (MDPs). Temporal-difference (TD) learning is a reinforcement learning (RL) algorithm that learns an infinite-horizon discounted cost function (or value function) for a given fixed policy without model knowledge. In the distributed RL case, each agent receives a local reward through local processing. Information exchange over a sparse communication network allows the agents to learn the global value function corresponding to a global reward, which is the sum of the local rewards. In this paper, the problem is converted into a constrained convex optimization problem with a consensus constraint. We then propose a primal-dual distributed GTD algorithm and prove that it almost surely converges to a set of stationary points of the optimization problem.

43 citations
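As a rough illustration of the idea (a GTD2-style variant with a consensus step, not the authors' exact primal-dual algorithm), the Python sketch below has each agent update its primal weights theta and auxiliary GTD weights w from its own local reward only, then average both with its ring-network neighbors through a doubly stochastic matrix. All dimensions, step sizes, and the random MDP are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n_states, d = 4, 20, 6                     # agents, states, feature dim
gamma, alpha, beta = 0.95, 0.05, 0.05         # discount, step sizes

phi = rng.normal(size=(n_states, d))          # shared state features
P = rng.dirichlet(np.ones(n_states), size=n_states)  # fixed-policy transitions
R = rng.normal(size=(N, n_states))            # local reward of each agent

# Doubly stochastic mixing matrix for a ring network (sparse communication).
W = np.zeros((N, N))
for i in range(N):
    W[i, i] = 0.5
    W[i, (i - 1) % N] = W[i, (i + 1) % N] = 0.25

theta = np.zeros((N, d))   # primal: local value-function weights
w = np.zeros((N, d))       # auxiliary GTD correction weights

s = 0
for t in range(20000):
    s_next = rng.choice(n_states, p=P[s])
    for i in range(N):
        # Local TD error uses only agent i's own reward.
        delta = R[i, s] + gamma * phi[s_next] @ theta[i] - phi[s] @ theta[i]
        theta[i] += alpha * (phi[s] - gamma * phi[s_next]) * (phi[s] @ w[i])
        w[i] += beta * (delta - phi[s] @ w[i]) * phi[s]
    theta = W @ theta      # consensus step: mix iterates with neighbors
    w = W @ w
    s = s_next

print("disagreement across agents:", np.linalg.norm(theta - theta.mean(0)))
```

With enough mixing, the agents' weights agree and approximate the value function of the global reward (the sum of local rewards), the behavior the paper's convergence result makes precise.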

Journal ArticleDOI
TL;DR: This work proposes novel dynamic programming algorithms that alleviate the curse of dimensionality in problems that exhibit certain low-rank structure, and demonstrates the algorithms running in real time on board a quadcopter during a flight experiment under motion capture.
Abstract: Motion planning and control problems are embedded and essential in almost all robotics applications. These problems are often formulated as stochastic optimal control problems and solved using dynamic programming algorithms. Unfortunately, most existing algorithms that guarantee convergence to optimal solutions suffer from the curse of dimensionality: the run time of the algorithm grows exponentially with the dimension of the state space of the system. We propose novel dynamic programming algorithms that alleviate the curse of dimensionality in problems that exhibit certain low-rank structure. The proposed algorithms are based on continuous tensor decompositions recently developed by the authors. Essentially, the algorithms represent high-dimensional functions (e.g., the value function) in a compressed format and directly perform dynamic programming computations (e.g., value iteration, policy iteration) in this format. Under certain technical assumptions, the new algorithms guarantee convergence towards optimal solutions with arbitrary precision. Furthermore, the run times of the new algorithms scale polynomially with the state dimension and polynomially with the ranks of the value function. This approach realizes substantial computational savings in "compressible" problem instances, where value functions admit low-rank approximations. We demonstrate the new algorithms on a wide range of problems, including a simulated six-dimensional agile quadcopter maneuvering example and a seven-dimensional aircraft perching example. In some of these examples, we estimate computational savings of up to 10 orders of magnitude over standard value iteration algorithms. We further demonstrate the algorithms running in real time on board a quadcopter during a flight experiment under motion capture.

43 citations
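The compress-then-iterate idea can be shown in two dimensions, where a continuous tensor decomposition degenerates to a truncated SVD of the value-function matrix. The sketch below is a toy discounted grid-world shortest-path problem, not the authors' continuous tensor-train algorithm: each step performs a standard Bellman backup and then truncates the result to a fixed rank before iterating.

```python
import numpy as np

n, gamma, rank = 100, 0.95, 5                 # grid size, discount, kept rank
cost = np.ones((n, n))
cost[-1, -1] = 0.0                            # absorbing goal state

moves = [(0, 1), (1, 0), (0, -1), (-1, 0), (0, 0)]

def shift(V, dx, dy):
    """Value of the landing state after move (dx, dy), clipped at walls."""
    xi = np.clip(np.arange(n) + dx, 0, n - 1)
    yi = np.clip(np.arange(n) + dy, 0, n - 1)
    return V[np.ix_(xi, yi)]

V = np.zeros((n, n))
for it in range(300):
    # Bellman backup: stage cost plus best discounted successor value.
    V_new = cost + gamma * np.min([shift(V, dx, dy) for dx, dy in moves], axis=0)
    V_new[-1, -1] = 0.0
    # Compression step: keep only the top `rank` singular directions.
    U, s, Vt = np.linalg.svd(V_new, full_matrices=False)
    V_low = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    if np.max(np.abs(V_low - V)) < 1e-6:      # stop once iterates stabilize
        break
    V = V_low

print("iterations:", it, " V[0, 0] ~", round(float(V[0, 0]), 3))
```

The memory and per-iteration cost now scale with the rank rather than with the full grid, which is the source of the savings the paper reports in high dimensions.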

Journal ArticleDOI
TL;DR: In this article, the authors consider parabolic Bellman equations with Lipschitz coefficients and obtain error bounds of order h^{1/2} for certain types of finite-difference schemes.
Abstract: We consider parabolic Bellman equations with Lipschitz coefficients. Error bounds of order h^{1/2} for certain types of finite-difference schemes are obtained.

43 citations
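For reference, a parabolic Bellman equation of the type considered here can be written schematically as follows (generic notation, not copied from the paper):

```latex
\partial_t v + \sup_{\alpha \in A}\big( L^{\alpha} v + f^{\alpha} \big) = 0,
\qquad v(T,x) = g(x), \qquad
L^{\alpha} = a^{\alpha}(t,x)\,\partial_{xx} + b^{\alpha}(t,x)\,\partial_{x} - c^{\alpha}(t,x)
```

The result states that for suitable finite-difference approximations v_h on a mesh of width h, the error satisfies sup |v - v_h| <= N h^{1/2}, with a constant N determined by the Lipschitz data.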


Network Information
Related Topics (5)
Optimal control: 68K papers, 1.2M citations (87% related)
Bounded function: 77.2K papers, 1.3M citations (85% related)
Markov chain: 51.9K papers, 1.3M citations (85% related)
Linear system: 59.5K papers, 1.4M citations (84% related)
Optimization problem: 96.4K papers, 2.1M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2023    261
2022    537
2021    369
2020    411
2019    348
2018    353