scispace - formally typeset
Topic

Bellman equation

About: Bellman equation is a research topic. Over the lifetime, 5884 publications have been published within this topic receiving 135589 citations.
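The Bellman optimality equation, V(s) = max_a [ r(s,a) + γ Σ_{s'} P(s'|s,a) V(s') ], is the fixed-point equation that value iteration solves by repeated backups. As an illustrative sketch (the two-state MDP below is invented for illustration, not drawn from any of the papers listed here):

```python
# Value iteration on a tiny, made-up two-state MDP.
# Bellman optimality backup: V(s) <- max_a [ r(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]

gamma = 0.9  # discount factor

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}

def value_iteration(transitions, gamma, tol=1e-8):
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            v_new = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:  # backups have (approximately) reached the fixed point
            return V

V = value_iteration(transitions, gamma)
print(V)  # state 1 always "stays" for reward 2: V(1) = 2 / (1 - 0.9) = 20
```

Here V(1) = 2/(1 − γ) = 20 exactly, and V(0) solves V(0) = 0.8(1 + 0.9·20) + 0.2·0.9·V(0), i.e. V(0) = 15.2/0.82 ≈ 18.54.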


Papers
Journal ArticleDOI
TL;DR: A deep-neural-network-based method for solving a class of elliptic partial differential equations is proposed; it is a 'Derivative-Free Loss Method' since it does not require explicit calculation of the derivatives of the neural network with respect to its inputs in order to compute the training loss.

39 citations

Journal ArticleDOI
TL;DR: A Lyapunov-function-based approach to shaping the reward function is proposed; it is verified that the shaped reward substantially accelerates the convergence of training and improves performance in terms of a higher accumulated reward.

38 citations

Proceedings ArticleDOI
11 Dec 1996
TL;DR: A necessary and sufficient condition of absolute stabilizability is given in terms of the existence of suitable solutions to a dynamic programming equation and a Riccati algebraic equation of the H∞ filtering type.
Abstract: The paper considers the output feedback robust stabilizability problem for hybrid dynamical systems. The hybrid system under consideration is a composite of a continuous plant and a discrete event controller. A necessary and sufficient condition of absolute stabilizability is given in terms of the existence of suitable solutions to a dynamic programming equation and a Riccati algebraic equation of the H∞ filtering type. A real-time implementable method for absolute stabilization is also presented.

38 citations

Journal ArticleDOI
TL;DR: A two-timescale simulation-based actor-critic algorithm for solution of infinite horizon Markov decision processes with finite state and compact action spaces under the discounted cost criterion is proposed and the proof of convergence to a locally optimal policy is presented.
Abstract: A two-timescale simulation-based actor-critic algorithm for solution of infinite horizon Markov decision processes with finite state and compact action spaces under the discounted cost criterion is proposed. The algorithm does gradient search on the slower timescale in the space of deterministic policies and uses simultaneous perturbation stochastic approximation-based estimates. On the faster scale, the value function corresponding to a given stationary policy is updated and averaged over a fixed number of epochs (for enhanced performance). The proof of convergence to a locally optimal policy is presented. Finally, numerical experiments using the proposed algorithm on flow control in a bottleneck link using a continuous time queueing model are shown.
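The two-timescale structure can be sketched on a toy one-state problem: on the faster scale, the value of the current stationary policy is estimated by averaging sampled rewards over a fixed number of epochs; on the slower scale, the deterministic policy parameter moves along a simultaneous-perturbation (SPSA) gradient estimate with diminishing step sizes. The toy problem and all constants below are invented for illustration and are not the paper's flow-control model:

```python
# Two-timescale, SPSA-based actor-critic sketch on a toy one-state problem.
import random

random.seed(0)
gamma = 0.9

def reward(a):
    # Noisy reward, maximized at the action a = 2.
    return -(a - 2.0) ** 2 + 0.01 * random.gauss(0, 1)

def critic(theta, n_epochs=200):
    # Faster timescale: estimate the discounted value of the stationary
    # deterministic policy a = theta by averaging over fixed epochs.
    return sum(reward(theta) for _ in range(n_epochs)) / n_epochs / (1 - gamma)

theta = 0.0
delta = 0.1        # SPSA perturbation size
for k in range(1, 300):
    lr = 0.05 / k  # slower timescale: diminishing actor step size
    # Simultaneous-perturbation gradient estimate of the value w.r.t. theta.
    # (In 1-D this reduces to an ordinary central difference; SPSA's advantage
    # is that it perturbs all policy parameters at once in higher dimensions.)
    sign = random.choice([-1.0, 1.0])
    grad = (critic(theta + sign * delta)
            - critic(theta - sign * delta)) / (2 * sign * delta)
    theta += lr * grad  # gradient ascent on the estimated value
print(round(theta, 2))  # approaches the optimal action 2.0
```

The key design point mirrored here is the step-size separation: the critic averages over a whole batch before the actor takes one small step, so from the actor's viewpoint the value estimate looks quasi-static.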

38 citations

Proceedings ArticleDOI
10 Jul 1999
TL;DR: This work uses neural networks to approximate the solution to the Hamilton-Jacobi-Bellman (HJB) equation, a first-order, nonlinear partial differential equation, and derives the gradient descent rule for integrating this equation inside the domain, given the conditions on the boundary.
Abstract: We investigate new approaches to dynamic-programming-based optimal control of continuous time-and-space systems. We use neural networks to approximate the solution to the Hamilton-Jacobi-Bellman (HJB) equation which is a first-order, nonlinear, partial differential equation. We derive the gradient descent rule for integrating this equation inside the domain, given the conditions on the boundary. We apply this approach to the "car-on-the-hill" which is a 2D highly nonlinear control problem. We discuss the results obtained and point out a low quality of approximation of the value function and of the derived control. We attribute this bad approximation to the fact that the HJB equation has many generalized solutions other than the value function, and our gradient descent method converges to one among these functions, thus possibly failing to find the correct value function. We illustrate this limitation on a simple 1D control problem.
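The idea of fitting a parametric value function by gradient descent on the HJB residual can be shown on a toy 1-D problem where the exact solution is known (this toy problem is ours for illustration, not the paper's car-on-the-hill task). For dynamics dx/dt = u with running cost x² + u², the HJB equation 0 = min_u [x² + u² + V'(x)·u] gives u* = −V'(x)/2 and residual R(x) = x² − V'(x)²/4, solved by V(x) = x². We fit the ansatz V_θ(x) = θx² by descending the mean squared residual:

```python
# Fit a parametric value function by gradient descent on the squared HJB
# residual for the toy problem  dx/dt = u,  cost = integral of (x^2 + u^2) dt.
# Minimizing over u in the HJB equation gives residual R(x) = x^2 - V'(x)^2 / 4.
# With the ansatz V_theta(x) = theta * x^2, we have V'(x) = 2*theta*x, so
# R(x) = x^2 - (theta * x)^2; the exact solution corresponds to theta = 1.

xs = [i / 10.0 for i in range(-10, 11)]  # collocation points on [-1, 1]

def hjb_loss(theta):
    return sum((x * x - (theta * x) ** 2) ** 2 for x in xs) / len(xs)

theta, lr, eps = 0.3, 0.5, 1e-6
for _ in range(2000):
    # Numerical gradient of the loss w.r.t. the single parameter theta.
    grad = (hjb_loss(theta + eps) - hjb_loss(theta - eps)) / (2 * eps)
    theta -= lr * grad
print(round(theta, 4))  # approaches 1.0, recovering V(x) = x^2
```

Note that this tiny example dodges the pathology the paper highlights: with a richer function class, the HJB equation admits generalized solutions other than the true value function, and plain residual descent may converge to one of them.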

38 citations


Network Information
Related Topics (5)
Optimal control: 68K papers, 1.2M citations, 87% related
Bounded function: 77.2K papers, 1.3M citations, 85% related
Markov chain: 51.9K papers, 1.3M citations, 85% related
Linear system: 59.5K papers, 1.4M citations, 84% related
Optimization problem: 96.4K papers, 2.1M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:
Year  Papers
2023  261
2022  537
2021  369
2020  411
2019  348
2018  353