Topic

Bellman equation

About: Bellman equation is a research topic. Over its lifetime, 5,884 publications have been published within this topic, receiving 135,589 citations.


Papers
Posted Content
TL;DR: This paper rigorously formulates the general sequential optimal experimental design (sOED) problem as a dynamic program, adopting a Bayesian formulation with an information theoretic design objective, and develops new numerical approaches for nonlinear design with continuous parameter, design, and observation spaces.
Abstract: The design of multiple experiments is commonly undertaken via suboptimal strategies, such as batch (open-loop) design that omits feedback or greedy (myopic) design that does not account for future effects. This paper introduces new strategies for the optimal design of sequential experiments. First, we rigorously formulate the general sequential optimal experimental design (sOED) problem as a dynamic program. Batch and greedy designs are shown to result from special cases of this formulation. We then focus on sOED for parameter inference, adopting a Bayesian formulation with an information theoretic design objective. To make the problem tractable, we develop new numerical approaches for nonlinear design with continuous parameter, design, and observation spaces. We approximate the optimal policy by using backward induction with regression to construct and refine value function approximations in the dynamic program. The proposed algorithm iteratively generates trajectories via exploration and exploitation to improve approximation accuracy in frequently visited regions of the state space. Numerical results are verified against analytical solutions in a linear-Gaussian setting. Advantages over batch and greedy design are then demonstrated on a nonlinear source inversion problem where we seek an optimal policy for sequential sensing.

54 citations
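
The backward induction with regression mentioned in the abstract above can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it assumes a hypothetical scalar state with dynamics x' = x + a + noise, a quadratic stage reward, a fixed grid of regression points instead of exploration and exploitation trajectories, and a quadratic polynomial fit as the value function approximation. All names and parameters are illustrative.

```python
import numpy as np


def stage_q_values(x, next_coeffs, actions, noise):
    """Estimate Q(x, a) for each action by Monte Carlo over the noise sample.

    Assumed toy model: reward -(x^2) - 0.1 a^2, next state x + a + noise.
    """
    q = []
    for a in actions:
        x_next = x + a + noise
        expected_next_value = np.mean(np.polyval(next_coeffs, x_next))
        q.append(-x**2 - 0.1 * a**2 + expected_next_value)
    return np.array(q)


def backward_induction_with_regression(horizon=3, n_states=41, n_noise=50, seed=0):
    """Fit one quadratic value function approximation per stage, last to first."""
    rng = np.random.default_rng(seed)
    states = np.linspace(-3.0, 3.0, n_states)    # regression sample points
    actions = np.linspace(-1.0, 1.0, 9)          # candidate actions
    noise = rng.normal(0.0, 0.1, size=n_noise)   # fixed Monte Carlo sample

    coeffs = [None] * (horizon + 1)
    coeffs[horizon] = np.array([-1.0, 0.0, 0.0])  # terminal value V_N(x) = -x^2

    for k in range(horizon - 1, -1, -1):
        # Bellman backup at each sample state, then regress on the targets.
        targets = np.array(
            [stage_q_values(x, coeffs[k + 1], actions, noise).max() for x in states]
        )
        coeffs[k] = np.polyfit(states, targets, deg=2)
    return coeffs, actions, noise


def greedy_action(x, k, coeffs, actions, noise):
    """Action that maximizes the approximate Q-function at stage k."""
    q = stage_q_values(x, coeffs[k + 1], actions, noise)
    return actions[int(np.argmax(q))]


if __name__ == "__main__":
    coeffs, actions, noise = backward_induction_with_regression()
    print("stage-0 value coefficients:", coeffs[0])
    print("greedy action at x = 2:", greedy_action(2.0, 0, coeffs, actions, noise))
```

In this toy setting the fitted value functions are concave quadratics, so the greedy policy steers the state toward zero. A belief-state formulation with an information theoretic objective, as in the paper, would need a different state representation and reward.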

Journal ArticleDOI
TL;DR: A key structural property for the decision function is proved, and this property is exploited in the development of continuous value function approximations that form the basis of an approximate dispatch rule.
Abstract: We address the problem of dispatching a vehicle with different product classes. There is a common dispatch cost, but holding costs that vary by product class. The problem exhibits multidimensional state, outcome and action spaces, and as a result is computationally intractable using either discrete dynamic programming methods, or even as a deterministic integer program. We prove a key structural property for the decision function, and exploit this property in the development of continuous value function approximations that form the basis of an approximate dispatch rule. Comparisons on single product-class problems, where optimal solutions are available, demonstrate solutions that are within a few percent of optimal. The algorithm is then applied to a problem with 100 product classes, and comparisons against a carefully tuned myopic heuristic demonstrate significant improvements. © 2003 Wiley Periodicals, Inc. Naval Research Logistics 50: 742–769, 2003.

54 citations

Journal ArticleDOI
TL;DR: This paper presents a stochastic dynamic programming formulation of the Dynamic Integrated Model of Climate and the Economy (DICE), and the application of approximate dynamic programming techniques to numerically solve for the optimal policy under uncertain and decision-dependent technological change in a multi-stage setting.
Abstract: Analyses of global climate policy as a sequential decision under uncertainty have been severely restricted by dimensionality and computational burdens. Therefore, they have limited the number of decision stages, discrete actions, or number and type of uncertainties considered. In particular, two common simplifications are the use of two-stage models to approximate a multi-stage problem and exogenous formulations for inherently endogenous or decision-dependent uncertainties (in which the shock at time t+1 depends on the decision made at time t). In this paper, we present a stochastic dynamic programming formulation of the Dynamic Integrated Model of Climate and the Economy (DICE), and the application of approximate dynamic programming techniques to numerically solve for the optimal policy under uncertain and decision-dependent technological change in a multi-stage setting. We compare numerical results using two alternative value function approximation approaches, one parametric and one non-parametric. We show that increasing the variance of a symmetric mean-preserving uncertainty in abatement costs leads to higher optimal first-stage emission controls, but the effect is negligible when the uncertainty is exogenous. In contrast, the impact of decision-dependent cost uncertainty, a crude approximation of technology R&D, on optimal control is much larger, leading to higher control rates (lower emissions). Further, we demonstrate that the magnitude of this effect grows with the number of decision stages represented, suggesting that for decision-dependent phenomena, the conventional two-stage approximation will lead to an underestimate of the effect of uncertainty.

54 citations

Journal ArticleDOI
TL;DR: This paper proposes a particular form of the problem that exposes some useful properties of the gauge optimization framework (such as the variational properties of its value function), and yet maintains most of the generality of the abstract form of gauge optimization.
Abstract: Gauge functions significantly generalize the notion of a norm, and gauge optimization, as defined by [R. M. Freund, Math. Programming, 38 (1987), pp. 47--67], seeks the element of a convex set that is minimal with respect to a gauge function. This conceptually simple problem can be used to model a remarkable array of useful problems, including a special case of conic optimization, and related problems that arise in machine learning and signal processing. The gauge structure of these problems allows for a special kind of duality framework. This paper explores the duality framework proposed by Freund, and proposes a particular form of the problem that exposes some useful properties of the gauge optimization framework (such as the variational properties of its value function), and yet maintains most of the generality of the abstract form of gauge optimization.

54 citations
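
For orientation, the gauge optimization problem named in the abstract above can be written out as follows. This is a sketch using standard convex-analysis definitions; the symbols $\kappa$, $\mathcal{C}$, $\kappa^\circ$, and $\mathcal{C}'$ are notation assumed here, not taken from the abstract.

```latex
% A gauge \kappa is a nonnegative, convex, positively homogeneous
% function with \kappa(0) = 0. Norms are the special case of finite,
% symmetric gauges that vanish only at the origin.
\begin{align*}
  \text{(primal)} \quad
    & \min_{x} \ \kappa(x)
      \quad \text{subject to} \quad x \in \mathcal{C}, \\
  \text{(polar gauge)} \quad
    & \kappa^\circ(y) = \sup_{x} \,\{\, \langle x, y \rangle : \kappa(x) \le 1 \,\}, \\
  \text{(gauge dual)} \quad
    & \min_{y} \ \kappa^\circ(y)
      \quad \text{subject to} \quad y \in \mathcal{C}', \qquad
      \mathcal{C}' = \{\, y : \langle x, y \rangle \ge 1 \ \ \forall x \in \mathcal{C} \,\}.
\end{align*}
```

Here $\mathcal{C}$ is a convex set and $\mathcal{C}'$ is its antipolar set; the dual is again a gauge optimization problem, which is the structural feature the duality framework builds on.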

Journal ArticleDOI
TL;DR: The general framework deals with the important case when several consecutive orders may be decided before the effective execution of the first one, motivated by financial applications in the trading of illiquid assets such as hedge funds.

53 citations


Network Information
Related Topics (5)
Optimal control: 68K papers, 1.2M citations, 87% related
Bounded function: 77.2K papers, 1.3M citations, 85% related
Markov chain: 51.9K papers, 1.3M citations, 85% related
Linear system: 59.5K papers, 1.4M citations, 84% related
Optimization problem: 96.4K papers, 2.1M citations, 83% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    261
2022    537
2021    369
2020    411
2019    348
2018    353