Open Access Journal Article

The Linear Programming Approach to Approximate Dynamic Programming

TLDR
This article studies an efficient method based on linear programming for approximating solutions to large-scale stochastic control problems. Experimental results in queueing network control provide empirical support for the approach.
Abstract
The curse of dimensionality gives rise to prohibitive computational requirements that render infeasible the exact solution of large-scale stochastic control problems. We study an efficient method based on linear programming for approximating solutions to such problems. The approach "fits" a linear combination of pre-selected basis functions to the dynamic programming cost-to-go function. We develop error bounds that offer performance guarantees and also guide the selection of both basis functions and "state-relevance weights" that influence quality of the approximation. Experimental results in the domain of queueing network control provide empirical support for the methodology.
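
As a rough sketch of the fitting step the abstract describes, the approximation can be written as a linear program over the basis-function weights. The notation is an assumption, since the abstract introduces no symbols: $\Phi$ is the matrix whose columns are the pre-selected basis functions, $r$ is the weight vector, $c$ holds the state-relevance weights, $g_a$ and $P_a$ are the one-stage cost and transition probabilities under action $a$, and $\alpha \in (0,1)$ is a discount factor (a discounted-cost setting is assumed here).

```latex
\begin{align*}
\max_{r} \quad & c^{\top} \Phi r \\
\text{subject to} \quad & g_a(x) + \alpha \sum_{y} P_a(x, y)\,(\Phi r)(y) \;\ge\; (\Phi r)(x)
  \qquad \text{for all states } x \text{ and actions } a .
\end{align*}
```

Under these assumptions the program has one variable per basis function rather than one per state, which is where the reduction in dimension comes from; it still has one constraint per state-action pair.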


Citations

Game theory and AI: a unified approach to poker games

TL;DR: The result of this examination is that the reduction gained by direct application of model minimization on poker games is bounded and that this bound prevents this method from successfully tackling real-life poker variants.
Posted Content

Large Scale Markov Decision Processes with Changing Rewards

TL;DR: An algorithm is provided that achieves a state-of-the-art regret bound of $\tilde{O}(\sqrt{T})$ for large-scale MDPs with changing rewards, which, to the best of the authors' knowledge, is the first such bound.
Posted Content

Deep Primal-Dual Reinforcement Learning: Accelerating Actor-Critic using Bellman Duality.

Woon Sang Cho, +1 more
- 07 Dec 2017 - 
TL;DR: A parameterized Primal-Dual $\pi$ Learning method based on deep neural networks for Markov decision processes with large state spaces and off-policy reinforcement learning that significantly outperforms the one-step temporal-difference actor-critic method.
Proceedings Article

Iterated approximate value functions

TL;DR: This paper introduces a control policy which is referred to as the iterated approximate value function policy, which yields a time-varying policy, even in the case where the optimal policy is time-invariant.
Proceedings Article

Approximation of constrained average cost Markov control processes

TL;DR: In this article, the authors consider discrete-time constrained Markov control processes (MCPs) under the long-run expected average cost optimality criterion and propose a two-step method to numerically approximate the optimal value of these constrained MCPs.
References
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Book

Neural networks for pattern recognition

TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Book

Dynamic Programming and Optimal Control

TL;DR: The leading and most up-to-date textbook on the far-ranging algorithmic methodology of Dynamic Programming, which can be used for optimal control, Markovian decision problems, planning and sequential decision making under uncertainty, and discrete/combinatorial optimization.
Journal Article

Learning to Predict by the Methods of Temporal Differences

Richard S. Sutton
- 01 Aug 1988 - 
TL;DR: This article introduces a class of incremental learning procedures specialized for prediction – that is, for using past experience with an incompletely known system to predict its future behavior – and proves their convergence and optimality for special cases and relation to supervised-learning methods.
Book

Neuro-dynamic programming

TL;DR: This is the first textbook that fully explains the neuro-dynamic programming/reinforcement learning methodology, which is a recent breakthrough in the practical application of neural networks and dynamic programming to complex problems of planning, optimal decision making, and intelligent control.