
Bellman equation

About: Bellman equation is a research topic. Over the lifetime, 5884 publications have been published within this topic receiving 135589 citations.
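
For orientation, the equation the topic is named after: for a discounted Markov decision process with transition probabilities $P(s' \mid s, a)$, rewards $r(s, a)$, and discount factor $0 \le \gamma < 1$, the optimal value function satisfies the Bellman optimality equation (generic textbook notation, not tied to any particular paper below):

\[
V^*(s) = \max_{a} \Big[ r(s, a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^*(s') \Big].
\]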


Papers
Journal ArticleDOI
TL;DR: In this article, a perturbation approach for performing sensitivity analysis of mathematical programming problems is presented, where the active constraints are not assumed to remain active when the problem data are perturbed, nor are the partial derivatives assumed to exist.
Abstract: This paper presents a perturbation approach for performing sensitivity analysis of mathematical programming problems. Contrary to standard methods, the active constraints are not assumed to remain active when the problem data are perturbed, nor are the partial derivatives assumed to exist. In other words, all the elements (variables, parameters, Karush–Kuhn–Tucker multipliers, and objective function values) may vary provided that optimality is maintained, and the general structure of a feasible perturbation, which is a polyhedral cone, is obtained. This allows determining: (a) the local sensitivities, (b) whether or not partial derivatives exist, and (c) whether the directional derivative exists for a given direction. A method for simultaneously obtaining the sensitivities of the optimal objective function value and of the primal and dual variable values with respect to the data is given. Three examples illustrate the concepts presented and the proposed methodology. Finally, some relevant conclusions are drawn.
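
For context on this abstract (a classical result, stated here as a reference point rather than as the paper's method): when the active set is stable and the relevant derivatives exist, the sensitivity of the optimal value $z^*$ to a data parameter $a$ is given by the envelope formula

\[
\frac{\partial z^*}{\partial a} = \frac{\partial L}{\partial a}\bigg|_{(x^*, \lambda^*)}, \qquad L(x, \lambda; a) = f(x; a) + \lambda^\top h(x; a),
\]

evaluated at the optimal primal and dual solutions. The paper treats precisely the degenerate situations where this smooth formula can fail, so that only directional derivatives along a polyhedral cone of feasible perturbations may exist.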

84 citations

Proceedings Article
03 Jan 2001
TL;DR: This paper presents a simple approach for computing reasonable policies for factored Markov decision processes (MDPs), when the optimal value function can be approximated by a compact linear form.
Abstract: We present a simple approach for computing reasonable policies for factored Markov decision processes (MDPs), when the optimal value function can be approximated by a compact linear form. Our method is based on solving a single linear program that approximates the best linear fit to the optimal value function. By applying an efficient constraint generation procedure we obtain an iterative solution method that tackles concise linear programs. This direct linear programming approach experimentally yields a significant reduction in computation time over approximate value- and policy-iteration methods (sometimes reducing several hours to a few seconds). However, the quality of the solutions produced by linear programming is weaker—usually about twice the approximation error for the same approximating class. Nevertheless, the speed advantage allows one to use larger approximation classes to achieve similar error in reasonable time.
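
A minimal sketch of the approximate-linear-programming idea behind this paper, in Python (a toy tabular MDP with every Bellman constraint enumerated; the paper additionally uses a compact factored basis and constraint generation to keep the LP concise, which this sketch omits):

import numpy as np
from scipy.optimize import linprog

# Toy MDP with hypothetical data: 3 states, 2 actions.
nS, nA, gamma = 3, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(nS), size=(nS, nA))  # P[s, a, :] = next-state distribution
R = rng.uniform(size=(nS, nA))                 # r(s, a)

# Linear value-function approximation V(s) = sum_k w[k] * Phi[s, k].
Phi = np.eye(nS)            # tabular basis for illustration; a factored MDP would use a compact basis
mu = np.full(nS, 1.0 / nS)  # state-relevance weights in the LP objective

# ALP: minimize mu^T Phi w  subject to  (Phi w)(s) >= r(s, a) + gamma * E[(Phi w)(s')] for all (s, a),
# rewritten below in the A_ub @ w <= b_ub form that scipy's linprog expects.
A_ub, b_ub = [], []
for s in range(nS):
    for a in range(nA):
        A_ub.append(gamma * (P[s, a] @ Phi) - Phi[s])
        b_ub.append(-R[s, a])

res = linprog(c=mu @ Phi, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * Phi.shape[1])
print("approximate values V(s):", Phi @ res.x)

With the tabular basis used here the LP recovers the exact optimal value function; the point of the method is that a small factored basis keeps the program tractable when the state space is too large to enumerate values directly.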

84 citations

Journal ArticleDOI
TL;DR: In this paper, the authors consider continuous-state and continuous-time control problems where the admissible trajectories of the system are constrained to remain on a network and prove that the value function is the unique constrained viscosity solution of the Hamilton-Jacobi equation on the network.
Abstract: We consider continuous-state and continuous-time control problems where the admissible trajectories of the system are constrained to remain on a network. In our setting, the value function is continuous. We define a notion of constrained viscosity solution of Hamilton-Jacobi equations on the network and we study related comparison principles. Under suitable assumptions, we prove in particular that the value function is the unique constrained viscosity solution of the Hamilton-Jacobi equation on the network.
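
For orientation (generic notation, not the paper's precise setting): on each edge of the network, the value function $u$ of an infinite-horizon discounted problem formally satisfies a Hamilton-Jacobi-Bellman equation of the form

\[
\lambda u(x) + \sup_{a \in A} \big\{ -f(x, a) \cdot Du(x) - \ell(x, a) \big\} = 0,
\]

where $\lambda > 0$ is the discount rate, $f$ the controlled dynamics (constrained so trajectories stay on the network), and $\ell$ the running cost. The delicate point, which the paper's notion of constrained viscosity solution addresses, is how to interpret the equation and prove comparison principles at the vertices where several edges meet.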

84 citations

Journal ArticleDOI
TL;DR: This work proves convergence of an approximate dynamic programming algorithm for a class of high-dimensional stochastic control problems linked by a scalar storage device, given a technical condition.
Abstract: We prove convergence of an approximate dynamic programming algorithm for a class of high-dimensional stochastic control problems linked by a scalar storage device, given a technical condition. Our problem is motivated by the problem of optimizing energy flows for a power grid supported by grid-level storage. The problem is formulated as a stochastic dynamic program, where we estimate the value of resources in storage using a piecewise linear value function approximation. Given the technical condition, we provide a rigorous convergence proof for an approximate dynamic programming algorithm, which can capture both the amount of energy held in storage and other exogenous variables. Our algorithm exploits the natural concavity of the problem to avoid any need for explicit exploration policies.
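
A minimal sketch of a piecewise linear, concave value-function approximation for a scalar storage level, in Python (the stepsize, the sampled observation, and the price model are toy placeholders, not the authors' algorithm):

import numpy as np

rng = np.random.default_rng(1)
levels = 10           # storage discretized into unit increments
v = np.zeros(levels)  # v[i]: estimated marginal value of the (i+1)-th stored unit
alpha = 0.1           # smoothing stepsize

def project_nonincreasing(v):
    # Pool-adjacent-violators: L2 projection onto nonincreasing sequences,
    # which keeps the piecewise linear value function concave in the storage level.
    vals, cnts = [], []
    for x in v:
        vals.append(float(x))
        cnts.append(1)
        while len(vals) > 1 and vals[-2] < vals[-1]:
            tot = vals[-1] * cnts[-1] + vals[-2] * cnts[-2]
            cnt = cnts[-1] + cnts[-2]
            vals[-2:], cnts[-2:] = [tot / cnt], [cnt]
    return np.array([x for x, c in zip(vals, cnts) for _ in range(c)])

for it in range(1000):
    price = rng.uniform(0.5, 1.5)  # exogenous information (e.g., an energy price)
    i = rng.integers(levels)       # storage level visited on this iteration
    vhat = price                   # toy sampled marginal value of one more stored unit
    v[i] = (1 - alpha) * v[i] + alpha * vhat
    v = project_nonincreasing(v)

print("estimated marginal values:", np.round(v, 3))

Maintaining nonincreasing marginal values is what "exploiting the natural concavity" refers to: the projection, rather than an explicit exploration policy, keeps the approximation well behaved.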

84 citations

Journal ArticleDOI
TL;DR: In this paper, a functional central limit theorem (CLT) was established for mean field games, which characterizes the limiting fluctuations around the LLN limit as the unique solution of a linear stochastic PDE.
Abstract: Mean field games (MFGs) describe the limit, as $n$ tends to infinity, of stochastic differential games with $n$ players interacting with one another through their common empirical distribution. Under suitable smoothness assumptions that guarantee uniqueness of the MFG equilibrium, a form of the law of large numbers (LLN), also known as propagation of chaos, has been established to show that the MFG equilibrium arises as the limit of the sequence of empirical measures of the $n$-player game Nash equilibria, including the case when player dynamics are driven by both idiosyncratic and common sources of noise. The proof of convergence relies on the so-called master equation for the value function of the MFG, a partial differential equation on the space of probability measures. In this work, under additional assumptions, we establish a functional central limit theorem (CLT) that characterizes the limiting fluctuations around the LLN limit as the unique solution of a linear stochastic PDE. The key idea is to use the solution to the master equation to construct an associated McKean-Vlasov interacting $n$-particle system that is sufficiently close to the Nash equilibrium dynamics of the $n$-player game for large $n$. We then derive the CLT for the latter from the CLT for the former. Along the way, we obtain a new multidimensional CLT for McKean-Vlasov systems. We also illustrate the broader applicability of our methodology by applying it to establish a CLT for a specific linear-quadratic example that does not satisfy our main assumptions, and we explicitly solve the resulting stochastic PDE in this case.
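
For orientation (the generic form of the system, under standard smoothness assumptions): an MFG equilibrium is characterized by a backward Hamilton-Jacobi-Bellman equation for the value function $u$ coupled with a forward Fokker-Planck equation for the population distribution $m$,

\[
\begin{aligned}
-\partial_t u - \nu \Delta u + H(x, Du) &= f(x, m),\\
\partial_t m - \nu \Delta m - \operatorname{div}\!\big(m\, D_p H(x, Du)\big) &= 0,
\end{aligned}
\]

with $m(0)$ given and a terminal condition on $u$. The master equation mentioned in the abstract recasts this coupled system as a single equation for a function $U(t, x, m)$ on the space of probability measures, and it is this solution that the authors use to build the approximating McKean-Vlasov particle system.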

84 citations


Network Information
Related Topics (5)
Optimal control: 68K papers, 1.2M citations, 87% related
Bounded function: 77.2K papers, 1.3M citations, 85% related
Markov chain: 51.9K papers, 1.3M citations, 85% related
Linear system: 59.5K papers, 1.4M citations, 84% related
Optimization problem: 96.4K papers, 2.1M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2023    261
2022    537
2021    369
2020    411
2019    348
2018    353