Topic

Bellman equation

About: The Bellman equation is a research topic. Over its lifetime, 5,884 publications have been published within this topic, receiving 135,589 citations.

Papers

Journal Article

Neurodynamic Programming and Zero-Sum Games for Constrained Control Systems

Murad Abu-Khalaf¹, Frank L. Lewis², Jie Huang³

MathWorks¹, University of Texas at Arlington², The Chinese University of Hong Kong³

01 Jul 2008-IEEE Transactions on Neural Networks

TL;DR: In this paper, neural networks are used along with two-player policy iterations to solve for the feedback strategies of a continuous-time zero-sum game that appears in L2-gain optimal control, suboptimal H∞ control, of nonlinear systems affine in input with the control policy having saturation constraints.

Abstract: In this paper, neural networks are used along with two-player policy iterations to solve for the feedback strategies of a continuous-time zero-sum game that appears in L2-gain optimal control, suboptimal H∞ control, of nonlinear systems affine in input with the control policy having saturation constraints. The result is a closed-form representation, on a prescribed compact set chosen a priori, of the feedback strategies and the value function that solves the associated Hamilton-Jacobi-Isaacs (HJI) equation. The closed-loop stability, L2-gain disturbance attenuation of the neural network saturated control feedback strategy, and uniform convergence results are proven. Finally, this approach is applied to the rotational/translational actuator (RTAC) nonlinear benchmark problem under actuator saturation, offering guaranteed stability and disturbance attenuation.

173 citations

Algorithms for partially observable Markov decision processes

Hsien-Te Cheng, Shelby Brumelle

01 Jan 1989

TL;DR: The thesis develops methods to solve discrete-time finite-state partially observable Markov decision processes and proves that the policy improvement step in iterative discretization procedure can be replaced by the approximation version of linear support algorithm.

Abstract: The thesis develops methods to solve discrete-time finite-state partially observable Markov decision processes. For the infinite horizon problem, only discounted reward case is considered. For the finite horizon problem, two new algorithms are developed. The first algorithm is called the relaxed region algorithm. For each support in the value function, this algorithm determines a region not smaller than its support region and modifies it implicitly in later steps until the exact support region is found. The second algorithm, called linear support algorithm, systematically approximates the value function until all supports in the value function are found. The most important feature of this algorithm is that it can be modified to find an approximate value function. It has been shown that these two algorithms are more efficient than the one-pass algorithm. For the infinite horizon problem, it is first shown that the approximation version of linear support algorithm can be used to substitute the policy improvement step in a standard successive approximation method to obtain an $\epsilon$-optimal value function. Next, an iterative discretization procedure is developed which uses a small number of states to find new supports and improve the value function between two policy improvement steps. Since only a finite number of states are chosen in this process, some techniques developed for finite MDP can be applied here. Finally, we prove that the policy improvement step in iterative discretization procedure can be replaced by the approximation version of linear support algorithm. The last part of the thesis deals with problems with continuous signals. We first show that if the signal processes are uniformly distributed, then the problem can be reformulated as a problem with finite number of signals. Then the result is extended to where the signal processes are step functions. 
Since step functions can easily be used to approximate most probability distributions, this method can be used to approximate most problems with continuous signals. Finally, we present some conditions which guarantee that the linear support can be computed for any given state; the methods developed for finite signal cases can then be easily modified and applied to problems for which the conditions hold.
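
The "standard successive approximation method" mentioned in the abstract can be illustrated on a much simpler problem. The following is a minimal sketch for a fully observable, finite, discounted MDP, not the thesis's POMDP algorithms; the function name `value_iteration`, the array layout, and the two-state example are all assumptions made for illustration. It iterates the Bellman operator until a standard contraction bound guarantees an ε-optimal value function.

```python
import numpy as np


def value_iteration(P, R, gamma, eps):
    """Successive approximation for a finite discounted MDP (illustrative).

    P: array of shape (A, S, S); P[a, s, t] is the probability of moving
       from state s to state t under action a.
    R: array of shape (A, S); expected one-step reward of action a in state s.
    gamma: discount factor, strictly between 0 and 1.
    Returns (V, policy) where V is within eps of the optimal value function
    in the sup norm and policy is greedy with respect to the last iterate.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    V = np.zeros(P.shape[1])
    # Contraction bound: ||V_new - V*|| <= gamma / (1 - gamma) * ||V_new - V||,
    # so stopping at this tolerance leaves an error of at most eps.
    tol = eps * (1.0 - gamma) / gamma
    while True:
        Q = R + gamma * (P @ V)      # Bellman backup, shape (A, S)
        V_new = Q.max(axis=0)        # maximize over actions
        if np.max(np.abs(V_new - V)) <= tol:
            return V_new, Q.argmax(axis=0)
        V = V_new


# Hypothetical two-state example: action 0 stays put, action 1 switches state.
# Only staying in state 1 pays a reward of 1.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 1.0],
              [0.0, 0.0]])
V, policy = value_iteration(P, R, gamma=0.9, eps=1e-8)
```

In this toy problem the optimal values can be found by hand: staying in state 1 forever is worth 1 / (1 - 0.9) = 10, and switching from state 0 to state 1 is worth 0.9 × 10 = 9, so the returned `V` should be close to (9, 10).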

173 citations

Journal Article

Lax–Hopf Based Incorporation of Internal Boundary Conditions Into Hamilton–Jacobi Equation. Part I: Theory

Christian Claudel¹, Alexandre M. Bayen¹

University of California, Berkeley¹

02 Feb 2010-IEEE Transactions on Automatic Control

TL;DR: This article proposes a new approach for computing a semi-explicit form of the solution to a class of Hamilton-Jacobi (HJ) partial differential equations (PDEs), using control techniques based on viability theory.

Abstract: This article proposes a new approach for computing a semi-explicit form of the solution to a class of Hamilton-Jacobi (HJ) partial differential equations (PDEs), using control techniques based on viability theory. We characterize the epigraph of the value function solving the HJ PDE as a capture basin of a target through an auxiliary dynamical system, called "characteristic system". The properties of capture basins enable us to define components as building blocks of the solution to the HJ PDE in the Barron/Jensen-Frankowska sense. These components can encode initial conditions, boundary conditions, and internal "boundary" conditions, which are the topic of this article. A generalized Lax-Hopf formula is derived, and enables us to formulate the necessary and sufficient conditions for a mixed initial and boundary conditions problem with multiple internal boundary conditions to be well posed. We illustrate the capabilities of the method with a data assimilation problem for reconstruction of highway traffic flow using Lagrangian measurements generated from Next Generation Simulation (NGSIM) traffic data.
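
For background, the classical Lax-Hopf (Hopf-Lax) formula that the article generalizes can be written as follows. This is the textbook version for a pure initial-value problem, stated under the assumptions noted in the comments; it is not the generalized formula with boundary and internal boundary conditions derived in the article.

```latex
% Classical setting (an assumption here, not the article's generalized problem):
%   u_t + H(u_x) = 0  for t > 0,   u(x, 0) = g(x),
% with H convex and superlinear and g Lipschitz continuous.
u(x,t) = \min_{y \in \mathbb{R}^n}
  \left\{ t\, L\!\left(\frac{x - y}{t}\right) + g(y) \right\},
\qquad
L(q) = \sup_{p \in \mathbb{R}^n} \left\{ p \cdot q - H(p) \right\}.
```

Here L is the Legendre transform of the Hamiltonian H, so the solution at (x, t) is obtained by minimizing over the starting points y of straight-line characteristics.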

172 citations

Journal Article

Wide-sense adaptive dual control for nonlinear stochastic systems

Edison Tse, Y. Bar-Shalom, L. Meier

01 Apr 1973-IEEE Transactions on Automatic Control

TL;DR: In this paper, a new approach is presented for the problem of stochastic control of nonlinear systems, which takes into account the past observations and also the future observation program.

Abstract: A new approach is presented for the problem of stochastic control of nonlinear systems. It is well known that, except for the linear-quadratic problem, the optimal stochastic controller cannot be obtained in practice. In general it is the curse of dimensionality that makes the strict application of the principle of optimality infeasible. The two subproblems of stochastic control, estimation and control proper, are, except for the linear-quadratic case, intercoupled. As pointed out by Feldbaum, in addition to its effects on the state of the system, the control also affects the estimation performance. In this paper, the control problem is formulated such that this dual property of the control appears explicitly. The resulting control sequence exhibits the closed-loop property, i.e., it takes into account the past observations and also the future observation program. Thus, in addition to being adaptive, this control also plans its future learning according to the control objective. Some preliminary simulation results illustrate these properties of the control.

172 citations

Journal Article

Error bounds for monotone approximation schemes for parabolic Hamilton-Jacobi-Bellman equations

Guy Barles¹, Espen R. Jakobsen²

François Rabelais University¹, Norwegian University of Science and Technology²

01 Oct 2007-Mathematics of Computation

TL;DR: The nonsymmetric upper and lower bounds on the rate of convergence of general monotone approximation/numerical schemes for parabolic Hamilton-Jacobi-Bellman equations are obtained by introducing a new notion of consistency.

Abstract: We obtain nonsymmetric upper and lower bounds on the rate of convergence of general monotone approximation/numerical schemes for parabolic Hamilton-Jacobi-Bellman equations by introducing a new notion of consistency. Our results are robust and general: they improve and extend earlier results by Krylov, Barles, and Jakobsen. We apply our general results to various schemes including Crank-Nicolson type finite difference schemes, splitting methods, and the classical approximation by piecewise constant controls. In the first two cases our results are new, and in the last two cases the results are obtained by a new method which we develop here.

171 citations


Performance

Metrics

Papers: 6,698
Citations: 155,793

No. of papers in the topic in previous years
Year	Papers
2023	261
2022	537
2021	369
2020	411
2019	348
2018	353
