Topic

Bellman equation

About: Bellman equation is a research topic. Over its lifetime, 5884 publications have been published on this topic, receiving 135589 citations.


Papers
Proceedings Article
01 Jan 2002
TL;DR: This paper shows that each of the solutions is optimal with respect to a specific objective function and characterises the different solutions as images of the optimal exact value function under different projection operations.
Abstract: There are several reinforcement learning algorithms that yield approximate solutions for the problem of policy evaluation when the value function is represented with a linear function approximator. In this paper we show that each of the solutions is optimal with respect to a specific objective function. Moreover, we characterise the different solutions as images of the optimal exact value function under different projection operations. The results presented here will be useful for comparing the algorithms in terms of the error they achieve relative to the error of the optimal approximate solution.
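
As a concrete companion to this abstract, the sketch below computes three such "optimal" weight vectors for a toy Markov chain, each minimising a different objective: the d-weighted orthogonal projection of the exact value function, the TD(0)/LSTD fixed point (an oblique projection), and Bellman-residual minimisation. The chain, the features, and the weighting are invented for illustration and only NumPy is assumed; this is a sketch of the general idea, not the paper's own setup.

```python
# Sketch: three "optimal" solutions for linear policy evaluation, each the
# minimiser of a different objective. MDP, features and weighting are
# hypothetical; only numpy is assumed.
import numpy as np

n, k, gamma = 4, 2, 0.9
P = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.5, 0.5, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.5, 0.0, 0.0, 0.5]])      # transition matrix of the fixed policy
r = np.array([1.0, 0.0, 0.0, 2.0])        # expected one-step rewards
Phi = np.random.default_rng(0).normal(size=(n, k))  # feature matrix

# Stationary distribution d (left eigenvector of P), used as the weighting.
evals, evecs = np.linalg.eig(P.T)
d = np.real(evecs[:, np.argmax(np.real(evals))])
d = d / d.sum()
D = np.diag(d)

V_exact = np.linalg.solve(np.eye(n) - gamma * P, r)   # exact value function

# 1) d-weighted least squares: orthogonal projection of V_exact onto span(Phi).
w_proj = np.linalg.solve(Phi.T @ D @ Phi, Phi.T @ D @ V_exact)

# 2) TD(0) fixed point (the LSTD solution): an oblique projection of V_exact.
A = Phi.T @ D @ (Phi - gamma * P @ Phi)
w_td = np.linalg.solve(A, Phi.T @ D @ r)

# 3) Bellman residual minimisation: least squares on the residual itself.
M = Phi - gamma * P @ Phi
w_br = np.linalg.solve(M.T @ D @ M, M.T @ D @ r)

for name, w in [("projection", w_proj), ("TD fixed point", w_td),
                ("Bellman residual", w_br)]:
    err = np.sqrt(d @ (Phi @ w - V_exact) ** 2)
    print(f"{name:18s} weighted error vs exact V: {err:.4f}")
```

Comparing the three printed errors against the first (the optimal approximate solution) is exactly the kind of comparison the abstract says the results enable.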

56 citations

Journal ArticleDOI
TL;DR: It is argued that a failure to recognize the special features of the model in the context of which the principle was stated has resulted in the latter being misconstrued in the dynamic programming literature.
Abstract: New light is shed on Bellman's principle of optimality and the role it plays in Bellman's conception of dynamic programming. It is argued that a failure to recognize the special features of the model in the context of which the principle was stated has resulted in the latter being misconstrued in the dynamic programming literature.
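
For orientation, the principle of optimality is most familiar through the Bellman optimality equation; a standard discounted-MDP textbook form (illustrative notation, not the specific model the paper analyses) is:

```latex
% Bellman optimality equation for a discounted MDP (standard textbook
% form; illustrative notation, not the paper's specific model).
V^{*}(s) = \max_{a \in A} \Big[ r(s,a)
  + \gamma \sum_{s' \in S} P(s' \mid s, a)\, V^{*}(s') \Big]
```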

56 citations

Journal ArticleDOI
TL;DR: Stochastic optimal control problems with jumps are studied with the help of the theory of Backward Stochastic Differential Equations (BSDEs) with jumps, and the value functions are proved to be the viscosity solutions of the associated generalized Hamilton–Jacobi–Bellman equations with integral-differential operators.
Abstract: In this paper we study stochastic optimal control problems with jumps with the help of the theory of Backward Stochastic Differential Equations (BSDEs) with jumps. We generalize the results of Peng [S. Peng, BSDE and stochastic optimizations, in: J. Yan, S. Peng, S. Fang, L. Wu, Topics in Stochastic Analysis, Science Press, Beijing, 1997 (Chapter 2) (in Chinese)] by considering cost functionals defined by controlled BSDEs with jumps. The application of BSDE methods, in particular the use of the notion of stochastic backward semigroups introduced by Peng in the above-mentioned work, allows a straightforward proof of a dynamic programming principle for value functions associated with stochastic optimal control problems with jumps. We prove that the value functions are the viscosity solutions of the associated generalized Hamilton–Jacobi–Bellman equations with integral-differential operators. For this proof, we adapt Peng’s BSDE approach, given in the above-mentioned reference and developed in the framework of stochastic control problems driven by Brownian motion, to that of stochastic control problems driven by Brownian motion and Poisson random measure.
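
As a point of reference, a generalized HJB equation with an integral-differential operator of the kind the abstract refers to typically takes the following schematic form for a controlled jump diffusion; the notation here is illustrative, and the paper's actual equation additionally involves the driver of the controlled BSDE:

```latex
% Schematic HJB equation with an integral-differential operator for a
% controlled jump diffusion (illustrative notation only).
\partial_t W(t,x)
  + \sup_{u \in U} \Big\{ b(x,u) \cdot \nabla_x W(t,x)
  + \tfrac{1}{2} \operatorname{tr}\!\big( \sigma \sigma^{\top}(x,u)\, D_x^2 W(t,x) \big)
  + \int_E \big[ W(t, x + \beta(x,u,e)) - W(t,x)
      - \nabla_x W(t,x) \cdot \beta(x,u,e) \big]\, \nu(de)
  + f(x,u) \Big\} = 0,
\qquad W(T,x) = \Phi(x).
```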

55 citations

Journal ArticleDOI
TL;DR: In this paper, the authors studied the stochastic optimal control problem of fully coupled forward-backward stochastic differential equations (FBSDEs) and proved that the value functions are deterministic, satisfy the dynamic programming principle, and are viscosity solutions.
Abstract: In this paper we study stochastic optimal control problems of fully coupled forward-backward stochastic differential equations (FBSDEs). The recursive cost functionals are defined by controlled fully coupled FBSDEs. We use a new method to prove that the value functions are deterministic, satisfy the dynamic programming principle, and are viscosity solutions to the associated generalized Hamilton–Jacobi–Bellman (HJB) equations. For this we generalize the notion of stochastic backward semigroup introduced by Peng [Topics on Stochastic Analysis, Science Press, Beijing, 1997, pp. 85–138]. We emphasize that when $\sigma$ depends on the second component $Z$ of the solution $(Y, Z)$ of the BSDE, the stochastic control problem becomes much more complicated, with the consequence that the associated HJB equation is combined with an algebraic equation. We prove that the algebraic equation has a unique solution, and moreover, we also give the representation for this solution. On the other hand, we prove a new local existence...
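
For orientation, a fully coupled FBSDE has the schematic (uncontrolled, illustrative) form below. The coupling comes from the forward coefficients depending on the backward pair $(Y, Z)$, and it is the dependence of $\sigma$ on $Z$ that gives rise to the algebraic equation mentioned in the abstract:

```latex
% Schematic fully coupled FBSDE (illustrative, uncontrolled notation).
% The forward coefficients b and \sigma depend on (Y, Z), which couples
% the two equations; \sigma depending on Z is the delicate case.
\begin{aligned}
 dX_t &= b(t, X_t, Y_t, Z_t)\, dt + \sigma(t, X_t, Y_t, Z_t)\, dW_t,
   & X_0 &= x,\\
 dY_t &= -f(t, X_t, Y_t, Z_t)\, dt + Z_t\, dW_t,
   & Y_T &= \Phi(X_T).
\end{aligned}
```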

55 citations

Proceedings Article
28 Jun 2001
TL;DR: It is proved that if an MDP possesses a symmetry, then the optimal value function and Q function are similarly symmetric and there exists a symmetric optimal policy.
Abstract: This paper examines the notion of symmetry in Markov decision processes (MDPs). We define symmetry for an MDP and show how it can be exploited for more effective learning in single agent systems as well as multiagent systems and multirobot systems. We prove that if an MDP possesses a symmetry, then the optimal value function and Q function are similarly symmetric and there exists a symmetric optimal policy. If an MDP is known to possess a symmetry, this knowledge can be applied to decrease the number of training examples needed for algorithms like Q learning and value iteration. It can also be used to directly restrict the hypothesis space.
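
To make the idea concrete, the sketch below threads a known symmetry through tabular Q-learning on a hypothetical symmetric random walk: every observed transition also yields an update for its mirror image, roughly halving the experience needed. The environment, the mirror maps, and the hyper-parameters are all invented for illustration; the paper's own construction is more general.

```python
# Sketch: exploiting a known MDP symmetry during tabular Q-learning.
# Environment, mirror maps and hyper-parameters are hypothetical.
import random
from collections import defaultdict

GAMMA, ALPHA, EPS = 0.95, 0.1, 0.1
ACTIONS = (-1, 1)
Q = defaultdict(float)

def mirror_state(s):
    return -s          # hypothetical symmetry on a 1-D integer state space

def mirror_action(a):
    return -a          # matching hypothetical symmetry on actions {-1, +1}

def step(s, a):
    """Hypothetical symmetric random walk: reward for reaching |s| == 3."""
    s2 = max(-3, min(3, s + a))
    r = 1.0 if abs(s2) == 3 else 0.0
    return s2, r, abs(s2) == 3

def update(s, a, r, s2, done):
    best = 0.0 if done else max(Q[(s2, b)] for b in ACTIONS)
    Q[(s, a)] += ALPHA * (r + GAMMA * best - Q[(s, a)])

for episode in range(500):
    s, done = 0, False
    while not done:
        a = random.choice(ACTIONS) if random.random() < EPS else \
            max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r, done = step(s, a)
        update(s, a, r, s2, done)
        # The mirror image of the transition is also a valid transition in a
        # symmetric MDP, so it gets the same update for free.
        update(mirror_state(s), mirror_action(a), r, mirror_state(s2), done)
        s = s2
```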

55 citations


Network Information
Related Topics (5)
Optimal control: 68K papers, 1.2M citations, 87% related
Bounded function: 77.2K papers, 1.3M citations, 85% related
Markov chain: 51.9K papers, 1.3M citations, 85% related
Linear system: 59.5K papers, 1.4M citations, 84% related
Optimization problem: 96.4K papers, 2.1M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2023    261
2022    537
2021    369
2020    411
2019    348
2018    353