Topic
Bellman equation
About: The Bellman equation is a research topic. Over its lifetime, 5884 publications have been published within this topic, receiving 135589 citations.
Papers
TL;DR: In this article, the authors considered the scheduling control problem for a family of unitary networks under heavy traffic, with general interarrival and service times, probabilistic routing and infinite horizon discounted linear holding cost.
Abstract: We consider the scheduling control problem for a family of unitary networks under heavy traffic, with general interarrival and service times, probabilistic routing and infinite horizon discounted linear holding cost. A natural nonanticipativity condition for admissibility of control policies is introduced. The condition is seen to hold for a broad class of problems. Using this formulation of admissible controls and a time-transformation technique, we establish that the infimum of the cost for the network control problem over all admissible sequencing control policies is asymptotically bounded below by the value function of an associated diffusion control problem (the Brownian control problem). This result provides a useful bound on the best achievable performance for any admissible control policy for a wide class of networks.
49 citations
TL;DR: In this paper, a simple machine replacement and maintenance framework is used to highlight the difficulties of estimating the required value function numerically, and a properly modified projection algorithm is proposed for finite difference methods.
49 citations
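As a rough illustration of what "estimating the value function numerically" involves in a machine replacement setting, the sketch below runs plain value iteration on a toy discounted-cost problem. Everything in it is an assumption made for illustration and is not taken from the paper: the discrete wear states, the linear operating cost, the wear probability `p_wear`, the function names, and all parameter values. It does not implement the paper's projection algorithm or any finite difference scheme.

```python
import numpy as np


def bellman_backup(v, op_cost, replace_cost, p_wear, beta):
    """Return the cost of keeping and of replacing in every wear state."""
    n = len(v)
    states = np.arange(n)
    # Wear level after one more period of use (capped at the worst state).
    next_worn = np.minimum(states + 1, n - 1)
    # Keep: pay an operating cost that grows with wear, then the machine
    # wears by one level with probability p_wear or stays as it is.
    keep = op_cost * states + beta * (p_wear * v[next_worn] + (1 - p_wear) * v)
    # Replace: pay a fixed cost and start the next period in state 0.
    replace = np.full(n, replace_cost + beta * v[0])
    return keep, replace


def solve_machine_replacement(n_states=10, op_cost=1.0, replace_cost=5.0,
                              p_wear=0.7, beta=0.9, tol=1e-10, max_iter=10_000):
    """Value iteration for V(s) = min(keep(s), replace(s)) (toy model)."""
    v = np.zeros(n_states)
    for _ in range(max_iter):
        keep, replace = bellman_backup(v, op_cost, replace_cost, p_wear, beta)
        v_new = np.minimum(keep, replace)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    keep, replace = bellman_backup(v, op_cost, replace_cost, p_wear, beta)
    policy = (replace < keep).astype(int)  # 1 = replace, 0 = keep
    return v, policy


if __name__ == "__main__":
    v, policy = solve_machine_replacement()
    print("value function:", np.round(v, 3))
    print("policy (1 = replace):", policy)
```

Because the backup is a contraction with modulus `beta`, the iteration converges from any starting guess. In this toy model the resulting policy is a threshold rule: keep the machine while wear is low and replace it once wear passes some level.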
TL;DR: In this paper, the Dirichlet problem for the PDE underlying the convex envelope is studied. An optimal regularity result is derived: $C^{1,\alpha}$ regularity of the boundary data implies the same regularity for the solution in the interior, even though the solution need not be continuous up to the boundary. The convex envelope is also characterized as the value function of a stochastic control problem.
Abstract: The Convex Envelope of a given function was recently characterized as the solution of a fully nonlinear Partial Differential Equation (PDE). In this article we study a modified problem: the Dirichlet problem for the underlying PDE. The main result is an optimal regularity result. Differentiability ($C^{1,\alpha}$ regularity) of the boundary data implies the corresponding result for the solution in the interior, despite the fact that the solution need not be continuous up to the boundary. Secondary results are the characterization of the convex envelope as: (i) the value function of a stochastic control problem, and (ii) the optimal underestimator for a class of nonlinear elliptic PDEs.
49 citations
TL;DR: A game-theory-based approach to multi-target searching with a multi-robot system in a dynamic environment, whose main advantage is its real-time capability while remaining efficient and robust to dynamic environments.
Abstract: This paper proposes a game-theory-based approach to multi-target searching using a multi-robot system in a dynamic environment. It is assumed that a rough a priori probability map of the targets' distribution within the environment is given. To account for the interaction between the robots, a dynamic-programming equation is proposed to estimate the utility function for each robot. Based on this utility function, a cooperative nonzero-sum game is generated, where both pure Nash equilibrium and mixed-strategy equilibrium solutions are presented to achieve optimal overall robot behavior. Special consideration has been given to improving the real-time performance of the game-theory-based approach. Several mechanisms, such as event-driven discretization, one-step dynamic programming, and a decision buffer, have been proposed to reduce the computational complexity. The main advantage of the algorithm lies in its real-time capabilities whilst being efficient and robust to dynamic environments.
49 citations
TL;DR: A novel mean-field framework is proposed that offers a more efficient modeling tool and a more accurate solution scheme in tackling directly the issue of nonseparability and deriving the optimal policies analytically for the multi-period mean-variance-type portfolio selection problems.
Abstract: When a dynamic optimization problem is not decomposable by a stage-wise backward recursion, it is nonseparable in the sense of dynamic programming. The classical dynamic programming-based optimal stochastic control methods would fail in such nonseparable situations as the principle of optimality no longer applies. Among these notorious nonseparable problems, the dynamic mean-variance portfolio selection formulation had posed a great challenge to our research community until recently. Different from the existing literature that invokes embedding schemes and auxiliary parametric formulations to solve the dynamic mean-variance portfolio selection formulation, we propose in this paper a novel mean-field framework that offers a more efficient modeling tool and a more accurate solution scheme in tackling directly the issue of nonseparability and deriving the optimal policies analytically for the multi-period mean-variance-type portfolio selection problems.
49 citations
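One standard way to see why a mean-variance objective is nonseparable is to compare how expectation and variance behave under conditioning. The notation here is an assumption for illustration and is not taken from the paper: $W_T$ denotes terminal wealth and $\mathcal{F}_t$ the information available at stage $t$.

$$
\mathbb{E}[W_T] = \mathbb{E}\big[\mathbb{E}[W_T \mid \mathcal{F}_t]\big],
\qquad
\operatorname{Var}(W_T) = \mathbb{E}\big[\operatorname{Var}(W_T \mid \mathcal{F}_t)\big] + \operatorname{Var}\big(\mathbb{E}[W_T \mid \mathcal{F}_t]\big).
$$

The expectation passes through the conditioning unchanged, which is what a stage-wise backward recursion relies on. The variance picks up the extra term $\operatorname{Var}(\mathbb{E}[W_T \mid \mathcal{F}_t])$, so minimizing the conditional variance at each stage does not, in general, minimize the unconditional variance.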