
Bellman equation

About: The Bellman equation is a research topic. Over its lifetime, 5,884 publications have been published within this topic, receiving 135,589 citations.

Proceedings Article
29 Nov 1999
TL;DR: This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
Abstract: Function approximation is essential to reinforcement learning, but the standard approach of approximating a value function and determining a policy from it has so far proven theoretically intractable. In this paper we explore an alternative approach in which the policy is explicitly represented by its own function approximator, independent of the value function, and is updated according to the gradient of expected reward with respect to the policy parameters. Williams's REINFORCE method and actor-critic methods are examples of this approach. Our main new result is to show that the gradient can be written in a form suitable for estimation from experience aided by an approximate action-value or advantage function. Using this result, we prove for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
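As a rough illustration of the gradient form the abstract describes, the sketch below checks it on the simplest possible case: a single state with a softmax policy over a few actions. The action values `q` are assumed to be known exactly, and all function names are hypothetical; the paper's setting (multiple states, approximate action values) is more general than this.

```python
import math


def softmax(theta):
    """Softmax policy over actions, parameterised by one preference per action."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def expected_reward(theta, q):
    """Expected reward of the softmax policy when action a pays q[a]."""
    return sum(p * r for p, r in zip(softmax(theta), q))


def policy_gradient(theta, q):
    """Gradient written as sum_a pi(a) * d log pi(a)/d theta_k * q(a).

    For a softmax policy, d log pi(a)/d theta_k = 1[a == k] - pi(k).
    """
    pi = softmax(theta)
    n = len(theta)
    grad = [0.0] * n
    for k in range(n):
        for a in range(n):
            dlog = (1.0 if a == k else 0.0) - pi[k]
            grad[k] += pi[a] * dlog * q[a]
    return grad


def finite_difference_gradient(theta, q, eps=1e-6):
    """Central-difference gradient of the expected reward, used as a check."""
    grad = []
    for k in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[k] += eps
        down[k] -= eps
        grad.append((expected_reward(up, q) - expected_reward(down, q)) / (2 * eps))
    return grad


if __name__ == "__main__":
    theta = [0.2, -0.5, 1.0]   # assumed policy parameters
    q = [1.0, 0.0, 2.0]        # assumed exact action values
    print(policy_gradient(theta, q))
    print(finite_difference_gradient(theta, q))
```

In this toy case the two gradients should agree up to finite-difference error, which is the property that makes the score-function form usable for estimation from sampled experience.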

4,394 citations

01 Jun 1979
TL;DR: This augmented edition of a respected text teaches the reader how to use linear quadratic Gaussian methods effectively for the design of control systems, with step-by-step explanations that show clearly how to make practical use of the material.
Abstract: This augmented edition of a respected text teaches the reader how to use linear quadratic Gaussian methods effectively for the design of control systems. It explores linear optimal control theory from an engineering viewpoint, with step-by-step explanations that show clearly how to make practical use of the material. The three-part treatment begins with the basic theory of the linear regulator/tracker for time-invariant and time-varying systems. The Hamilton-Jacobi equation is introduced using the Principle of Optimality, and the infinite-time problem is considered. The second part outlines the engineering properties of the regulator. Topics include degree of stability, phase and gain margin, tolerance of time delay, effect of nonlinearities, asymptotic properties, and various sensitivity problems. The third section explores state estimation and robust controller design using state-estimate feedback. Numerous examples emphasize the issues related to consistent and accurate system design. Key topics include loop-recovery techniques, frequency shaping, and controller reduction, for both scalar and multivariable systems. Self-contained appendixes cover matrix theory, linear systems, the Pontryagin minimum principle, Lyapunov stability, and the Riccati equation. Newly added to this Dover edition is a complete solutions manual for the problems appearing at the conclusion of each section.
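To make the link between the linear regulator and the Riccati equation concrete, here is a minimal sketch for a scalar, discrete-time system. Both the discrete-time setting and the names below are assumptions chosen for brevity, not the book's own formulation: the dynamics are x[t+1] = a*x[t] + b*u[t] and the cost is the sum of q*x[t]^2 + r*u[t]^2.

```python
def solve_scalar_riccati(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Iterate the scalar discrete-time Riccati recursion to a fixed point.

    The fixed point p satisfies p = q + a^2 p - (a b p)^2 / (r + b^2 p),
    and p * x^2 is then the optimal cost-to-go from state x.
    """
    p = q
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("Riccati iteration did not converge")


def feedback_gain(a, b, r, p):
    """Optimal state-feedback gain k, so that the control is u = -k * x."""
    return a * b * p / (r + b * b * p)


if __name__ == "__main__":
    a, b, q, r = 1.2, 1.0, 1.0, 1.0   # assumed, open-loop unstable example
    p = solve_scalar_riccati(a, b, q, r)
    k = feedback_gain(a, b, r, p)
    print("p =", p, "k =", k, "closed-loop pole =", a - b * k)
```

With these assumed numbers the open-loop system is unstable (|a| > 1), while the closed-loop pole a - b*k should lie inside the unit circle.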

3,137 citations

01 Jan 1989
TL;DR: This text develops the recursive approach to dynamic models, covering deterministic and stochastic models of optimal growth, dynamic programming under certainty and its applications, deterministic dynamics, and the measure theory and integration needed for stochastic models.
Abstract: I. THE RECURSIVE APPROACH 1. Introduction 2. An Overview 2.1 A Deterministic Model of Optimal Growth 2.2 A Stochastic Model of Optimal Growth 2.3 Competitive Equilibrium Growth 2.4 Conclusions and Plans II. DETERMINISTIC MODELS 3. Mathematical Preliminaries 3.1 Metric Spaces and Normed Vector Spaces 3.2 The Contraction Mapping Theorem 3.3 The Theorem of the Maximum 4. Dynamic Programming under Certainty 4.1 The Principle of Optimality 4.2 Bounded Returns 4.3 Constant Returns to Scale 4.4 Unbounded Returns 4.5 Euler Equations 5. Applications of Dynamic Programming under Certainty 5.1 The One-Sector Model of Optimal Growth 5.2 A "Cake-Eating" Problem 5.3 Optimal Growth with Linear Utility 5.4 Growth with Technical Progress 5.5 A Tree-Cutting Problem 5.6 Learning by Doing 5.7 Human Capital Accumulation 5.8 Growth with Human Capital 5.9 Investment with Convex Costs 5.10 Investment with Constant Returns 5.11 Recursive Preferences 5.12 Theory of the Consumer with Recursive Preferences 5.13 A Pareto Problem with Recursive Preferences 5.14 An (s, S) Inventory Problem 5.15 The Inventory Problem in Continuous Time 5.16 A Seller with Unknown Demand 5.17 A Consumption-Savings Problem 6. Deterministic Dynamics 6.1 One-Dimensional Examples 6.2 Global Stability: Liapounov Functions 6.3 Linear Systems and Linear Approximations 6.4 Euler Equations 6.5 Applications III. STOCHASTIC MODELS 7. Measure Theory and Integration 7.1 Measurable Spaces 7.2 Measures 7.3 Measurable Functions 7.4 Integration 7.5 Product Spaces 7.6 The Monotone Class Lemma
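The contents above pair the Contraction Mapping Theorem with dynamic programming and a "cake-eating" problem. A minimal sketch of how these fit together is given below, under assumptions that are not taken from the book: the cake comes in integer units, the per-period utility is sqrt(c), and the discount factor is `beta`. The Bellman operator here is (Tv)(x) = max over 0 <= c <= x of sqrt(c) + beta * v(x - c).

```python
import math


def bellman_operator(v, beta):
    """Apply the Bellman operator to a value vector indexed by cake size."""
    return [
        max(math.sqrt(c) + beta * v[x - c] for c in range(x + 1))
        for x in range(len(v))
    ]


def sup_distance(v, w):
    """Sup-norm distance between two value vectors."""
    return max(abs(a - b) for a, b in zip(v, w))


def value_iteration(n_states, beta, tol=1e-10, max_iter=10_000):
    """Iterate the Bellman operator from zero until successive values agree."""
    v = [0.0] * n_states
    for _ in range(max_iter):
        v_next = bellman_operator(v, beta)
        if sup_distance(v_next, v) < tol:
            return v_next
        v = v_next
    raise RuntimeError("value iteration did not converge")


if __name__ == "__main__":
    beta = 0.9                       # assumed discount factor
    v = value_iteration(6, beta)     # cake sizes 0..5
    print(v)
```

Because the operator is a contraction with modulus `beta` in the sup norm, the iteration converges to the same fixed point from any bounded starting guess; starting from zero is just a convenient choice.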

2,943 citations

17 Nov 1975
TL;DR: This book treats deterministic and stochastic optimal control, moving from the calculus of variations and Pontryagin's principle through existence results and dynamic programming to stochastic differential equations and the optimal control of Markov diffusion processes.
Abstract: I The Simplest Problem in Calculus of Variations.- 1. Introduction.- 2. Minimum Problems on an Abstract Space-Elementary Theory.- 3. The Euler Equation; Extremals.- 4. Examples.- 5. The Jacobi Necessary Condition.- 6. The Simplest Problem in n Dimensions.- II The Optimal Control Problem.- 1. Introduction.- 2. Examples.- 3. Statement of the Optimal Control Problem.- 4. Equivalent Problems.- 5. Statement of Pontryagin's Principle.- 6. Extremals for the Moon Landing Problem.- 7. Extremals for the Linear Regulator Problem.- 8. Extremals for the Simplest Problem in Calculus of Variations.- 9. General Features of the Moon Landing Problem.- 10. Summary of Preliminary Results.- 11. The Free Terminal Point Problem.- 12. Preliminary Discussion of the Proof of Pontryagin's Principle.- 13. A Multiplier Rule for an Abstract Nonlinear Programming Problem.- 14. A Cone of Variations for the Problem of Optimal Control.- 15. Verification of Pontryagin's Principle.- III Existence and Continuity Properties of Optimal Controls.- 1. The Existence Problem.- 2. An Existence Theorem (Mayer Problem U Compact).- 3. Proof of Theorem 2.1.- 4. More Existence Theorems.- 5. Proof of Theorem 4.1.- 6. Continuity Properties of Optimal Controls.- IV Dynamic Programming.- 1. Introduction.- 2. The Problem.- 3. The Value Function.- 4. The Partial Differential Equation of Dynamic Programming.- 5. The Linear Regulator Problem.- 6. Equations of Motion with Discontinuous Feedback Controls.- 7. Sufficient Conditions for Optimality.- 8. The Relationship between the Equation of Dynamic Programming and Pontryagin's Principle.- V Stochastic Differential Equations and Markov Diffusion Processes.- 1. Introduction.- 2. Continuous Stochastic Processes; Brownian Motion Processes.- 3. Ito's Stochastic Integral.- 4. Stochastic Differential Equations.- 5. Markov Diffusion Processes.- 6. Backward Equations.- 7. Boundary Value Problems.- 8. Forward Equations.- 9. Linear System Equations; the Kalman-Bucy Filter.- 10. Absolutely Continuous Substitution of Probability Measures.- 11. An Extension of Theorems 5.1,5.2.- VI Optimal Control of Markov Diffusion Processes.- 1. Introduction.- 2. The Dynamic Programming Equation for Controlled Markov Processes.- 3. Controlled Diffusion Processes.- 4. The Dynamic Programming Equation for Controlled Diffusions; a Verification Theorem.- 5. The Linear Regulator Problem (Complete Observations of System States).- 6. Existence Theorems.- 7. Dependence of Optimal Performance on y and ?.- 8. Generalized Solutions of the Dynamic Programming Equation.- 9. Stochastic Approximation to the Deterministic Control Problem.- 10. Problems with Partial Observations.- 11. The Separation Principle.- Appendices.- A. Gronwall-Bellman Inequality.- B. Selecting a Measurable Function.- C. Convex Sets and Convex Functions.- D. Review of Basic Probability.- E. Results about Parabolic Equations.- F. A General Position Lemma.

2,941 citations

18 Dec 1997
TL;DR: This book outlines its main ideas on a model problem, then develops continuous and discontinuous viscosity solutions of Hamilton-Jacobi equations, with applications to optimal control problems, differential games, and the numerical solution of dynamic programming.
Abstract: Preface.- Basic notations.- Outline of the main ideas on a model problem.- Continuous viscosity solutions of Hamilton-Jacobi equations.- Optimal control problems with continuous value functions: unrestricted state space.- Optimal control problems with continuous value functions: restricted state space.- Discontinuous viscosity solutions and applications.- Approximation and perturbation problems.- Asymptotic problems.- Differential Games.- Numerical solution of Dynamic Programming.- Nonlinear H-infinity control by Pierpaolo Soravia.- Bibliography.- Index

2,561 citations

Network Information
Related Topics (5)
Optimal control
68K papers, 1.2M citations
87% related
Bounded function
77.2K papers, 1.3M citations
85% related
Markov chain
51.9K papers, 1.3M citations
85% related
Linear system
59.5K papers, 1.4M citations
84% related
Optimization problem
96.4K papers, 2.1M citations
83% related