scispace - formally typeset
Topic

Bellman equation

About: Bellman equation is a research topic. Over the lifetime, 5884 publications have been published within this topic receiving 135589 citations.
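The Bellman optimality equation, V(s) = max_a [ r(s,a) + γ Σ_{s'} P(s'|s,a) V(s') ], is the fixed-point equation that value iteration solves by repeated backups. As an illustrative sketch (the two-state MDP below is invented for illustration, not drawn from any of the papers listed here):

```python
# Value iteration on a tiny, made-up two-state MDP.
# Bellman optimality backup: V(s) <- max_a [ r(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]

gamma = 0.9  # discount factor

# transitions[state][action] = list of (probability, next_state, reward)
transitions = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {"stay": [(1.0, 1, 2.0)], "go": [(1.0, 0, 0.0)]},
}

def value_iteration(transitions, gamma, tol=1e-8):
    V = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            v_new = max(
                sum(p * (r + gamma * V[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:  # backups have (approximately) reached the fixed point
            return V

V = value_iteration(transitions, gamma)
print(V)  # state 1 always "stays" for reward 2: V(1) = 2 / (1 - 0.9) = 20
```

Here V(1) = 2/(1 − γ) = 20 exactly, and V(0) solves V(0) = 0.8(1 + 0.9·20) + 0.2·0.9·V(0), i.e. V(0) = 15.2/0.82 ≈ 18.54.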


Papers
Journal ArticleDOI
TL;DR: A deep-neural-network-based method for solving a class of elliptic partial differential equations is proposed; it is a 'Derivative-Free Loss Method' since it does not require explicit calculation of the derivatives of the neural network with respect to its inputs in order to compute the training loss.

39 citations

Journal ArticleDOI
TL;DR: A Lyapunov-function-based approach to shaping the reward function is proposed; it is verified that the shaped reward substantially accelerates the convergence of training and improves performance in terms of a higher accumulated reward.

38 citations

Proceedings ArticleDOI
11 Dec 1996
TL;DR: A necessary and sufficient condition of absolute stabilizability is given in terms of the existence of suitable solutions to a dynamic programming equation and a Riccati algebraic equation of the H∞ filtering type.
Abstract: The paper considers the output feedback robust stabilizability problem for hybrid dynamical systems. The hybrid system under consideration is a composite of a continuous plant and a discrete event controller. A necessary and sufficient condition of absolute stabilizability is given in terms of the existence of suitable solutions to a dynamic programming equation and a Riccati algebraic equation of the H∞ filtering type. A real-time implementable method for absolute stabilization is also presented.

38 citations

Journal ArticleDOI
TL;DR: A two-timescale simulation-based actor-critic algorithm for solution of infinite horizon Markov decision processes with finite state and compact action spaces under the discounted cost criterion is proposed and the proof of convergence to a locally optimal policy is presented.
Abstract: A two-timescale simulation-based actor-critic algorithm for solution of infinite horizon Markov decision processes with finite state and compact action spaces under the discounted cost criterion is proposed. The algorithm does gradient search on the slower timescale in the space of deterministic policies and uses simultaneous perturbation stochastic approximation-based estimates. On the faster scale, the value function corresponding to a given stationary policy is updated and averaged over a fixed number of epochs (for enhanced performance). The proof of convergence to a locally optimal policy is presented. Finally, numerical experiments using the proposed algorithm on flow control in a bottleneck link using a continuous time queueing model are shown.
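The two-timescale structure can be sketched on a toy one-state problem: on the faster scale, the value of the current stationary policy is estimated by averaging sampled rewards over a fixed number of epochs; on the slower scale, the deterministic policy parameter moves along a simultaneous-perturbation (SPSA) gradient estimate with diminishing step sizes. The toy problem and all constants below are invented for illustration and are not the paper's flow-control model:

```python
# Two-timescale, SPSA-based actor-critic sketch on a toy one-state problem.
import random

random.seed(0)
gamma = 0.9

def reward(a):
    # Noisy reward, maximized at the action a = 2.
    return -(a - 2.0) ** 2 + 0.01 * random.gauss(0, 1)

def critic(theta, n_epochs=200):
    # Faster timescale: estimate the discounted value of the stationary
    # deterministic policy a = theta by averaging over fixed epochs.
    return sum(reward(theta) for _ in range(n_epochs)) / n_epochs / (1 - gamma)

theta = 0.0
delta = 0.1        # SPSA perturbation size
for k in range(1, 300):
    lr = 0.05 / k  # slower timescale: diminishing actor step size
    # Simultaneous-perturbation gradient estimate of the value w.r.t. theta.
    # (In 1-D this reduces to an ordinary central difference; SPSA's advantage
    # is that it perturbs all policy parameters at once in higher dimensions.)
    sign = random.choice([-1.0, 1.0])
    grad = (critic(theta + sign * delta)
            - critic(theta - sign * delta)) / (2 * sign * delta)
    theta += lr * grad  # gradient ascent on the estimated value
print(round(theta, 2))  # approaches the optimal action 2.0
```

The key design point mirrored here is the step-size separation: the critic averages over a whole batch before the actor takes one small step, so from the actor's viewpoint the value estimate looks quasi-static.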

38 citations

Proceedings ArticleDOI
10 Jul 1999
TL;DR: This work uses neural networks to approximate the solution to the Hamilton-Jacobi-Bellman (HJB) equation, a first-order, nonlinear partial differential equation, and derives the gradient descent rule for integrating this equation inside the domain, given the conditions on the boundary.
Abstract: We investigate new approaches to dynamic-programming-based optimal control of continuous time-and-space systems. We use neural networks to approximate the solution to the Hamilton-Jacobi-Bellman (HJB) equation which is a first-order, nonlinear, partial differential equation. We derive the gradient descent rule for integrating this equation inside the domain, given the conditions on the boundary. We apply this approach to the "car-on-the-hill" which is a 2D highly nonlinear control problem. We discuss the results obtained and point out a low quality of approximation of the value function and of the derived control. We attribute this bad approximation to the fact that the HJB equation has many generalized solutions other than the value function, and our gradient descent method converges to one among these functions, thus possibly failing to find the correct value function. We illustrate this limitation on a simple 1D control problem.
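The idea of fitting a parametric value function by gradient descent on the HJB residual can be shown on a toy 1-D problem where the exact solution is known (this toy problem is ours for illustration, not the paper's car-on-the-hill task). For dynamics dx/dt = u with running cost x² + u², the HJB equation 0 = min_u [x² + u² + V'(x)·u] gives u* = −V'(x)/2 and residual R(x) = x² − V'(x)²/4, solved by V(x) = x². We fit the ansatz V_θ(x) = θx² by descending the mean squared residual:

```python
# Fit a parametric value function by gradient descent on the squared HJB
# residual for the toy problem  dx/dt = u,  cost = integral of (x^2 + u^2) dt.
# Minimizing over u in the HJB equation gives residual R(x) = x^2 - V'(x)^2 / 4.
# With the ansatz V_theta(x) = theta * x^2, we have V'(x) = 2*theta*x, so
# R(x) = x^2 - (theta * x)^2; the exact solution corresponds to theta = 1.

xs = [i / 10.0 for i in range(-10, 11)]  # collocation points on [-1, 1]

def hjb_loss(theta):
    return sum((x * x - (theta * x) ** 2) ** 2 for x in xs) / len(xs)

theta, lr, eps = 0.3, 0.5, 1e-6
for _ in range(2000):
    # Numerical gradient of the loss w.r.t. the single parameter theta.
    grad = (hjb_loss(theta + eps) - hjb_loss(theta - eps)) / (2 * eps)
    theta -= lr * grad
print(round(theta, 4))  # approaches 1.0, recovering V(x) = x^2
```

Note that this tiny example dodges the pathology the paper highlights: with a richer function class, the HJB equation admits generalized solutions other than the true value function, and plain residual descent may converge to one of them.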

38 citations


Network Information
Related Topics (5)
Optimal control: 68K papers, 1.2M citations, 87% related
Bounded function: 77.2K papers, 1.3M citations, 85% related
Markov chain: 51.9K papers, 1.3M citations, 85% related
Linear system: 59.5K papers, 1.4M citations, 84% related
Optimization problem: 96.4K papers, 2.1M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:
Year  Papers
2023  261
2022  537
2021  369
2020  411
2019  348
2018  353