scispace - formally typeset

Bellman equation

About: Bellman equation is a research topic. Over the lifetime, 5884 publications have been published within this topic receiving 135589 citations.
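As context for the papers below: the Bellman optimality equation V(s) = max_a [R(s,a) + γ Σ_s' P(s'|s,a) V(s')] is the fixed-point condition that value iteration solves by repeated backups. A minimal sketch (the 2-state, 2-action MDP here is a made-up example, not drawn from any paper on this page):

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP.
# P[a][s][s'] = transition probability, R[a][s] = immediate reward.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.1, 0.9]]])  # action 1
R = np.array([[1.0, 0.0],                 # action 0
              [0.5, 2.0]])                # action 1
gamma = 0.9

# Value iteration: repeatedly apply the Bellman optimality backup
#   V(s) <- max_a [ R(s,a) + gamma * sum_s' P(s'|s,a) V(s') ]
V = np.zeros(2)
for _ in range(1000):
    Q = R + gamma * P @ V          # Q[a][s], one backup for all (s, a)
    V_new = Q.max(axis=0)          # maximize over actions
    if np.max(np.abs(V_new - V)) < 1e-10:
        break                      # contraction => geometric convergence
    V = V_new

policy = Q.argmax(axis=0)          # greedy policy w.r.t. the converged values
```

Because the backup operator is a γ-contraction in the sup norm, the iterates converge to the unique fixed point regardless of the initial V.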


Papers
Journal ArticleDOI
TL;DR: In this paper, the authors provide a rationale for central place theory via a dynamic programming formulation of the social planner's problem of city hierarchy, and show that in any optimal solution there must be one and only one immediately smaller city between two neighboring larger cities.

39 citations

Journal ArticleDOI
TL;DR: In this article, an asymptotic analysis of hierarchical production planning in a manufacturing system with serial machines that are subject to breakdown and repair, and with convex costs is presented.
Abstract: This paper presents an asymptotic analysis of hierarchical production planning in a manufacturing system with serial machines that are subject to breakdown and repair, and with convex costs. The machines' capacities are modeled as Markov chains. Since the number of parts in the internal buffers between any two machines must be non-negative, the problem is inherently state constrained. As the rate of change in the machine states approaches infinity, the analysis yields a limiting problem in which the stochastic machine capacity is replaced by its equilibrium mean. A method of “lifting” and “modification” is introduced to construct near-optimal controls for the original problem from near-optimal controls of the limiting problem. The value function of the original problem is shown to converge to the value function of the limiting problem, and the convergence rate is obtained from a priori estimates of the asymptotic behavior of the Markov chains. As a result, an ...

39 citations

Proceedings Article
22 Jul 2012
TL;DR: This work shows how the continuous action maximization step in the dynamic programming backup can be evaluated optimally and symbolically and further integrates this technique to work with an efficient and compact data structure for SDP -- the extended algebraic decision diagram (XADD).
Abstract: Many real-world decision-theoretic planning problems are naturally modeled using both continuous state and action (CSA) spaces, yet little work has provided exact solutions for the case of continuous actions. In this work, we propose a symbolic dynamic programming (SDP) solution to obtain the optimal closed-form value function and policy for CSA-MDPs with multivariate continuous state and actions, discrete noise, piecewise linear dynamics, and piecewise linear (or restricted piecewise quadratic) reward. Our key contribution over previous SDP work is to show how the continuous action maximization step in the dynamic programming backup can be evaluated optimally and symbolically -- a task which amounts to symbolic constrained optimization subject to unknown state parameters; we further integrate this technique to work with an efficient and compact data structure for SDP -- the extended algebraic decision diagram (XADD). We demonstrate empirical results on a didactic nonlinear planning example and two domains from operations research to show the first automated exact solution to these problems.

39 citations
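The continuous-action maximization step this paper performs symbolically reduces, in the scalar piecewise-quadratic case, to a case analysis over the quadratic's vertex and the interval endpoints. A minimal numeric sketch of that case analysis (coefficients and bounds are hypothetical; the paper carries it out symbolically, with coefficients depending on unknown state parameters):

```python
def argmax_quadratic(a2, a1, a0, lo, hi):
    """Maximize q(u) = a2*u^2 + a1*u + a0 over u in [lo, hi].

    For a concave piece (a2 < 0) the unconstrained maximizer is the
    vertex -a1 / (2*a2), clamped to the interval; otherwise the
    maximum is attained at an endpoint. This mirrors, in a scalar
    numeric setting, the case analysis a symbolic max performs.
    """
    candidates = [lo, hi]
    if a2 < 0:
        candidates.append(min(max(-a1 / (2 * a2), lo), hi))
    q = lambda u: a2 * u * u + a1 * u + a0
    return max(candidates, key=q)

# e.g. maximize -(u - 1)^2 + 3 = -u^2 + 2u + 2 on [0, 2]  ->  u* = 1
u_star = argmax_quadratic(-1.0, 2.0, 2.0, 0.0, 2.0)
```

The XADD machinery in the paper essentially organizes this endpoint-vs-vertex case analysis into a compact decision-diagram representation over the state space.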

Journal ArticleDOI
TL;DR: A numerical relaxation framework is developed to efficiently compute a control strategy with a guaranteed performance upper bound and it is proved that by choosing the relaxation parameter sufficiently small, the performance of the resulting control strategy can be made arbitrarily close to the optimal one.

39 citations

Journal ArticleDOI
TL;DR: A new self-learning parallel control method, based on the adaptive dynamic programming (ADP) technique, is developed for solving the optimal control problem of discrete-time time-varying nonlinear systems; it aims to obtain an approximate optimal control law sequence.
Abstract: In this article, a new self-learning parallel control method, based on the adaptive dynamic programming (ADP) technique, is developed for solving the optimal control problem of discrete-time time-varying nonlinear systems. It aims to obtain an approximate optimal control law sequence while guaranteeing the convergence of the value function. After establishing the time-varying artificial system via neural networks over a given time horizon, a control-sequence-improvement ADP algorithm is developed to obtain the control law sequence. For the first time, criteria for the parallel execution are presented such that the value function is proven to converge to a finite neighborhood of the optimal performance index function. Finally, numerical results and analysis are presented to demonstrate the effectiveness of the parallel control method.

39 citations
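The "optimal control law sequence" for a time-varying system has a closed form in the linear-quadratic special case, via the backward Riccati recursion. The sketch below (a hypothetical scalar system, not the paper's neural-network scheme) computes the exact time-varying gain sequence that an ADP method would approximate when the dynamics are nonlinear or unknown:

```python
import numpy as np

# Hypothetical finite-horizon scalar LQ problem:
#   x_{t+1} = a_t x_t + b_t u_t,
#   cost = sum_t (q x_t^2 + r u_t^2) + qN x_N^2.
N = 5
a = np.linspace(1.0, 1.2, N)   # time-varying dynamics coefficients
b = np.full(N, 0.5)
q, r, qN = 1.0, 0.1, 1.0

# Backward Riccati recursion: the optimal control law sequence is
# u_t = -K[t] * x_t, with P_t the quadratic value-function coefficient.
P_t = qN
K = np.zeros(N)
for t in reversed(range(N)):
    K[t] = (b[t] * P_t * a[t]) / (r + b[t] ** 2 * P_t)
    P_t = q + a[t] ** 2 * P_t - a[t] * P_t * b[t] * K[t]
```

Here the value function V_t(x) = P_t x^2 is known exactly at every stage; the paper's contribution is establishing convergence of an iteratively learned value function to a neighborhood of this kind of optimum in the nonlinear, neural-network-approximated setting.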


Network Information
Related Topics (5)
- Optimal control: 68K papers, 1.2M citations, 87% related
- Bounded function: 77.2K papers, 1.3M citations, 85% related
- Markov chain: 51.9K papers, 1.3M citations, 85% related
- Linear system: 59.5K papers, 1.4M citations, 84% related
- Optimization problem: 96.4K papers, 2.1M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    268
2022    556
2021    375
2020    418
2019    353
2018    356