Topic
Bellman equation
About: The Bellman equation is a research topic. Over its lifetime, 5884 publications have been published within this topic, receiving 135589 citations.
Papers
TL;DR: In this article, a new optimization formulation of the linear quadratic regulator (LQR) problem via Lagrangian duality theory was proposed to lay theoretical foundations for potentially effective RL algorithms.
Abstract: Recently, reinforcement learning (RL) has been receiving more and more attention due to successful demonstrations outperforming human performance in certain challenging tasks. The goal of this paper is to study a new optimization formulation of the linear quadratic regulator (LQR) problem via Lagrangian duality theory in order to lay theoretical foundations for potentially effective RL algorithms. The new optimization problem includes the Q-function parameters, so it can be directly used to develop Q-learning algorithms, known to be among the most popular RL algorithms. We prove relations between saddle points of the Lagrangian function and the optimal solutions of the Bellman equation. As an example of its applications, we propose a model-free primal-dual Q-learning algorithm to solve the LQR problem and demonstrate its validity through examples.
40 citations
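The Bellman equation for LQR that the paper above analyzes has a classical fixed point: the quadratic value function V(x) = xᵀPx, where P solves the discrete-time Riccati equation. The following is a minimal sketch of plain Riccati value iteration on a made-up toy system; it illustrates the Bellman fixed point the duality analysis targets, not the paper's primal-dual Q-learning algorithm, and the dynamics matrices are assumptions for illustration only.

```python
import numpy as np

# Riccati value iteration for the discrete-time LQR Bellman equation:
#   P = Q + A'PA - A'PB (R + B'PB)^{-1} B'PA,
# with optimal feedback u = -Kx, K = (R + B'PB)^{-1} B'PA.
def lqr_riccati(A, B, Q, R, iters=5000):
    P = np.zeros_like(Q)
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # optimal gain
        P = Q + A.T @ P @ A - A.T @ P @ B @ K              # Bellman backup
    return P, K

# Toy double-integrator-like system (illustrative values, not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

P, K = lqr_riccati(A, B, Q, R)
rho = max(abs(np.linalg.eigvals(A - B @ K)))  # closed-loop spectral radius
```

For a controllable pair (A, B) with Q ≻ 0, the iteration converges and the resulting gain stabilizes the system (rho < 1), which is exactly the optimality condition the Bellman-equation solution encodes.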
TL;DR: In this paper, a nonlinear stochastic optimal control method for partially observable linear structures is proposed and illustrated on linear building structures equipped with control devices and sensors under horizontal ground-acceleration excitation.
40 citations
TL;DR: In this article, a unified view of intra- and inter-option model learning is presented, based on a major generalisation of the Bellman equation, which enables compositional planning over many levels of abstraction.
Abstract: In this paper we introduce a framework for option model composition. Option models are temporal abstractions that, like macro-operators in classical planning, jump directly from a start state to an end state. Prior work has focused on constructing option models from primitive actions, by intra-option model learning; or on using option models to construct a value function, by inter-option planning. We present a unified view of intra- and inter-option model learning, based on a major generalisation of the Bellman equation. Our fundamental operation is the recursive composition of option models into other option models. This key idea enables compositional planning over many levels of abstraction. We illustrate our framework using a dynamic programming algorithm that simultaneously constructs optimal option models for multiple subgoals, and also searches over those option models to provide rapid progress towards other subgoals.
40 citations
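The recursive composition of option models described above can be made concrete with a small sketch. Assuming (as an illustration, not the paper's exact notation) that an option model is a pair (R, P), where R[s] is the expected discounted reward accumulated while the option runs from state s and P[s, s'] is the discount-weighted probability of terminating in s', composing "option a, then option b" is a Bellman-style operation on the models themselves:

```python
import numpy as np

# Compose option model a followed by option model b:
#   R_ab = R_a + P_a R_b   (reward along a, plus discounted reward along b)
#   P_ab = P_a P_b         (chained discount-weighted terminal distributions)
def compose(model_a, model_b):
    Ra, Pa = model_a
    Rb, Pb = model_b
    return Ra + Pa @ Rb, Pa @ Pb

# Toy primitive-action model (illustrative values): one-step transition T,
# reward r, discount gamma folded into the transition part of the model.
gamma = 0.9
T = np.array([[0.5, 0.5], [0.2, 0.8]])
r = np.array([1.0, 0.0])
prim = (r, gamma * T)

two_step = compose(prim, prim)  # model that jumps two primitive steps
```

Composition is associative, which is what allows option models to be stacked into many levels of abstraction and planned over with dynamic programming, as the paper's framework does.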
TL;DR: In this article, the solution of impulse control problems is characterized in terms of superharmonic functions, which leads to optimal impulse control strategies. This can be seen as the impulse-control counterpart of the characterization of the value function of optimal stopping problems as the smallest superharmonic majorant of the reward function.
40 citations
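The optimal-stopping characterization referenced above, the value function as the smallest superharmonic majorant of the reward, can be computed on a finite Markov chain by iterating the stopping Bellman operator V = max(g, βPV). The chain and reward below are made up for illustration; this sketches the discrete analogue of the characterization, not the paper's impulse-control construction.

```python
import numpy as np

# Iterate V <- max(g, discount * P @ V) to the fixed point: the smallest
# function that dominates the reward g and is superharmonic for the chain.
def stopping_value(P, g, discount=0.95, tol=1e-10):
    V = g.copy()
    while True:
        V_new = np.maximum(g, discount * P @ V)  # stop vs. continue
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new

# Toy two-state chain and stopping reward (illustrative values).
P = np.array([[0.9, 0.1], [0.3, 0.7]])
g = np.array([0.0, 1.0])
V = stopping_value(P, g)
```

At the fixed point, V dominates g (it is a majorant) and satisfies V ≥ βPV (it is superharmonic); states where V = g are exactly the optimal stopping region.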
TL;DR: In this article, the authors studied the utility maximization problem for power utility random fields in a semimartingale financial market, with and without intermediate consumption, and introduced an opportunity process as a reduced form of the value process of the resulting stochastic control problem.
Abstract: We study the utility maximization problem for power utility random fields in a semimartingale financial market, with and without intermediate consumption. The notion of an opportunity process is introduced as a reduced form of the value process of the resulting stochastic control problem. We show how the opportunity process describes the key objects: optimal strategy, value function, and dual problem. The results are applied to obtain monotonicity properties of the optimal consumption.
40 citations