scispace - formally typeset

Bellman equation

About: Bellman equation is a research topic. Over the lifetime, 5884 publications have been published within this topic receiving 135589 citations.
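The Bellman optimality equation that underlies this topic, V(s) = max_a Σ_s' P(s'|s,a)[R(s,a) + γV(s')], can be illustrated with a minimal value-iteration sketch. All numbers below (a 3-state, 2-action MDP with made-up transitions and rewards) are hypothetical, chosen only to show the fixed-point computation:

```python
import numpy as np

# Hypothetical 3-state, 2-action MDP; transitions and rewards are made up.
n_states, n_actions, gamma = 3, 2, 0.9

# P[a, s, s'] = transition probability; R[a, s] = expected immediate reward.
P = np.array([
    [[0.8, 0.2, 0.0], [0.0, 0.9, 0.1], [0.1, 0.0, 0.9]],  # action 0
    [[0.1, 0.9, 0.0], [0.5, 0.0, 0.5], [0.0, 0.2, 0.8]],  # action 1
])
R = np.array([
    [1.0, 0.0, 5.0],   # action 0
    [0.0, 2.0, -1.0],  # action 1
])

V = np.zeros(n_states)
for _ in range(1000):
    # Bellman backup: Q[a, s] = R[a, s] + gamma * sum_s' P[a, s, s'] * V[s']
    Q = R + gamma * P @ V
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-10:   # contraction => convergence
        V = V_new
        break
    V = V_new

policy = Q.argmax(axis=0)  # greedy policy w.r.t. the converged value function
```

Because the Bellman backup is a γ-contraction in the sup norm, the iteration converges to the unique fixed point V*, and the greedy policy with respect to V* is optimal.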


Papers
Journal ArticleDOI
TL;DR: In this paper, a finite-horizon optimal control design for nonlinear discrete-time systems in affine form is presented; the requirement of knowing the system dynamics is relaxed by utilizing a neural network (NN)-based identifier to learn the control coefficient matrix.
Abstract: In this paper, the finite-horizon optimal control design for nonlinear discrete-time systems in affine form is presented. In contrast with the traditional approximate dynamic programming methodology, which requires at least partial knowledge of the system dynamics, here the requirement of knowing the complete system dynamics is relaxed by utilizing a neural network (NN)-based identifier to learn the control coefficient matrix. The identifier is then used together with an actor-critic-based scheme to learn the time-varying solution of the Hamilton-Jacobi-Bellman (HJB) equation, referred to as the value function, in an online and forward-in-time manner. Since the solution of the HJB equation is time-varying, NNs with constant weights and time-varying activation functions are considered. To properly satisfy the terminal constraint, an additional error term is incorporated in the novel update law so that the terminal constraint error is also minimized over time. Policy and/or value iterations are not needed, and the NN weights are updated once per sampling instant. The uniform ultimate boundedness of the closed-loop system is verified by standard Lyapunov stability theory under nonautonomous analysis. Numerical examples are provided to illustrate the effectiveness of the proposed method.
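The finite-horizon setting above centers on a time-varying value function with a terminal constraint. As a hedged illustration of that structure only, not the paper's NN-based online scheme, backward induction computes such a value function exactly when the dynamics are known; the dynamics, costs, and horizon below are all invented:

```python
# Hedged sketch: finite-horizon dynamic programming by backward induction,
# V_N(s) = terminal cost, V_k(s) = min_a [ cost(s, a) + V_{k+1}(f(s, a)) ].
# All dynamics, costs, and the horizon are made up for illustration.

N = 5                      # horizon
states = range(5)
actions = (-1, 0, 1)       # move left, stay, move right

def f(s, a):               # known deterministic dynamics, clipped to the grid
    return min(4, max(0, s + a))

def cost(s, a):            # running cost: distance from state 2 plus effort
    return (s - 2) ** 2 + 0.5 * abs(a)

terminal = {s: 10.0 * (s - 2) ** 2 for s in states}   # terminal-constraint cost

V = [dict() for _ in range(N + 1)]     # one value function per time step
V[N] = dict(terminal)
pi = [dict() for _ in range(N)]        # time-varying optimal policy
for k in range(N - 1, -1, -1):         # sweep backward in time
    for s in states:
        best_a = min(actions, key=lambda a: cost(s, a) + V[k + 1][f(s, a)])
        pi[k][s] = best_a
        V[k][s] = cost(s, best_a) + V[k + 1][f(s, best_a)]
```

Note the contrast with the paper: backward induction runs offline and backward in time with known dynamics, whereas the cited scheme learns the same kind of time-varying value function online and forward in time without a model.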

67 citations

Journal ArticleDOI
TL;DR: This work derives HJB equations and applies them to two examples, a portfolio optimization and a systemic risk model, and shows that Bellman's principle applies to the dynamic programming value function $V(\tau,\rho_\tau)$, where the dependency on $\rho_\tau$ is functional, as in P.-L. Lions' analysis of mean-field games (2007).

67 citations

Journal ArticleDOI
TL;DR: In this article, the authors consider Bellman equations of ergodic type related to risk-sensitive control, prove that the equation in general has multiple solutions, and classify the solutions by the global behavior of the diffusion process associated with each solution.
Abstract: Bellman equations of ergodic type related to risk-sensitive control are considered. We treat the case in which the nonlinear term is a positive quadratic form in the first-order partial derivatives of the solution, which includes the linear exponential quadratic Gaussian control problem. In this paper we prove that the equation in general has multiple solutions. We specify the set of all classical solutions and classify them by the global behavior of the diffusion process associated with each solution. The solution associated with an ergodic diffusion process plays a particular role, and we also prove the uniqueness of such a solution. Furthermore, the solution which yields ergodicity is stable under perturbation of the coefficients. Finally, we give a representation result for the solution corresponding to the ergodic diffusion.

67 citations

Journal ArticleDOI
TL;DR: A data-based robust adaptive control methodology is proposed for a class of nonlinear constrained-input systems with completely unknown dynamics; the obtained approximate optimal control is shown to guarantee that the unknown nonlinear system is stable in the sense of uniform ultimate boundedness.

67 citations

Journal ArticleDOI
TL;DR: In this paper, the authors study long-run average cost minimization for a stochastic inventory problem with Markovian demand, a fixed ordering cost, and a convex surplus cost.
Abstract: This paper is concerned with long-run average cost minimization of a stochastic inventory problem with Markovian demand, fixed ordering cost, and convex surplus cost. The states of the Markov chain represent different possible states of the environment. Using a vanishing discount approach, a dynamic programming equation and the corresponding verification theorem are established. Finally, the existence of an optimal state-dependent (s, S) policy is proved.
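As a hedged illustration of the state-dependent (s, S) policy whose existence the paper proves (the environment chain, demand model, and all cost parameters below are invented, not taken from the paper), the long-run average cost of a given policy can be estimated by simulation:

```python
import random

# Hypothetical state-dependent (s, S) policy under Markov-modulated demand:
# when inventory falls below s_i in environment state i, order up to S_i.
# All parameters below are made up for illustration.
random.seed(0)

trans = {0: [(0, 0.9), (1, 0.1)],      # environment transition probabilities
         1: [(0, 0.3), (1, 0.7)]}
demand_mean = {0: 2, 1: 6}             # state-dependent demand levels
policy = {0: (3, 10), 1: (8, 20)}      # (s_i, S_i) per environment state

K, h, p = 5.0, 1.0, 4.0                # fixed ordering, holding, shortage costs

def step_env(i):
    """Sample the next environment state of the Markov chain."""
    r, acc = random.random(), 0.0
    for j, q in trans[i]:
        acc += q
        if r < acc:
            return j
    return i

T, inv, env, total_cost = 10000, 10, 0, 0.0
for t in range(T):
    s, S = policy[env]
    if inv < s:                        # order up to S, paying the fixed cost K
        total_cost += K
        inv = S
    d = random.randint(0, 2 * demand_mean[env])   # crude symmetric demand draw
    inv -= d
    total_cost += h * max(inv, 0) + p * max(-inv, 0)
    inv = max(inv, 0)                  # assume lost sales for simplicity
    env = step_env(env)

avg_cost = total_cost / T              # long-run average cost estimate
```

Searching over the (s_i, S_i) pairs with such a simulator would approximate the optimal state-dependent policy that the paper establishes analytically via the vanishing discount approach.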

67 citations


Network Information
Related Topics (5)
- Optimal control: 68K papers, 1.2M citations (87% related)
- Bounded function: 77.2K papers, 1.3M citations (85% related)
- Markov chain: 51.9K papers, 1.3M citations (85% related)
- Linear system: 59.5K papers, 1.4M citations (84% related)
- Optimization problem: 96.4K papers, 2.1M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    261
2022    537
2021    369
2020    411
2019    348
2018    353