Open Access Proceedings Article

Non-parametric Approximate Dynamic Programming via the Kernel Method

TLDR
A novel non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful approximation and sample complexity guarantees and can serve as a viable alternative to state-of-the-art parametric ADP algorithms.
Abstract
This paper presents a novel non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful approximation and sample complexity guarantees. In particular, we establish both theoretically and computationally that our proposal can serve as a viable alternative to state-of-the-art parametric ADP algorithms, freeing the designer from carefully specifying an approximation architecture. We accomplish this by developing a kernel-based mathematical program for ADP. Via a computational study on a controlled queueing network, we show that our procedure is competitive with parametric ADP approaches.
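The abstract describes fitting value functions with kernels instead of a hand-specified approximation architecture. As a hedged illustration of that general idea (not the paper's actual kernel-based mathematical program), the sketch below fits a value function V(s) = Σᵢ αᵢ k(s, sᵢ) to sampled targets via kernel ridge regression; the RBF kernel, the toy 1-D state space, and all parameter values are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=50.0):
    # Gaussian (RBF) kernel matrix between state samples X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_kernel_value_function(states, targets, lam=1e-6, gamma=50.0):
    # Kernel ridge regression: V(s) = sum_i alpha_i * k(s, s_i).
    K = rbf_kernel(states, states, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(states)), targets)
    return lambda s: rbf_kernel(np.atleast_2d(s), states, gamma) @ alpha

# Toy 1-D example: "value" targets follow a smooth function of the state.
rng = np.random.default_rng(0)
S = rng.uniform(0, 1, size=(50, 1))
y = np.sin(2 * np.pi * S[:, 0])
V = fit_kernel_value_function(S, y)
print(V([0.25])[0])  # close to sin(pi/2) = 1
```

The appeal of the non-parametric route is visible even in this toy: no basis functions are chosen by hand, and the fit's flexibility grows with the number of sampled states.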



Citations
Posted Content

Q-learning with Nearest Neighbors

TL;DR: In this article, the authors consider model-free reinforcement learning for infinite-horizon discounted Markov decision processes (MDPs) with a continuous state space and unknown transition kernel, and provide a tight finite-sample analysis of the convergence rate.
Journal ArticleDOI

Practical kernel-based reinforcement learning

TL;DR: This paper presents an algorithm that turns KBRL into a practical reinforcement learning tool, shows that the resulting method significantly outperforms other state-of-the-art reinforcement learning algorithms on the tasks studied, and derives upper bounds on the distance between the value functions computed by KBRL and KBSF using the same data.
Journal ArticleDOI

A comparison of Monte Carlo tree search and rolling horizon optimization for large-scale dynamic resource allocation problems

TL;DR: This paper adapts MCTS and RHO to two problems – one inspired by tactical wildfire management and a classical problem involving the control of queueing networks – and undertakes an extensive computational study comparing the two methods on instances of both problems that are large in both their state and action spaces.
Journal ArticleDOI

Multi-period portfolio selection using kernel-based control policy with dimensionality reduction

TL;DR: Numerical experiments show that the nonlinear control policy implemented in this paper works not only to reduce computation time but also to improve out-of-sample investment performance.
Journal ArticleDOI

Shape Constraints in Economics and Operations Research

TL;DR: This paper briefly reviews an illustrative set of research utilizing shape constraints in the economics and operations research literature, highlighting methodological innovations and applications with particular emphasis on utility functions, production economics, and sequential decision making.
References
Journal ArticleDOI

A Numerical Method for Solving Singular Stochastic Control Problems

TL;DR: A method is proposed that combines finite element methods that numerically solve partial differential equations with a policy update procedure based on the principle of smooth pasting to iteratively solve Hamilton-Jacobi-Bellman equations associated with the stochastic control problem.
Journal ArticleDOI

Heavy Traffic Convergence of a Controlled, Multiclass Queueing System

TL;DR: In this paper, a connection is established between the optimal sequencing problem for a two-station, two-customer-class queueing network and the problem of controlling a multidimensional diffusion process obtained as a heavy-traffic limit of the queueing problem.
Proceedings ArticleDOI

Value iteration and optimization of multiclass queueing networks

TL;DR: It is argued that a natural choice for the initial value function is the value function for the associated deterministic control problem based upon a fluid model, or the approximate solution to Poisson’s equation obtained from the LP of Kumar and Meyn.
Proceedings Article

Reinforcement Learning using Kernel-Based Stochastic Factorization

TL;DR: A novel algorithm is introduced that improves the scalability of kernel-based reinforcement learning by resorting to a special decomposition of a transition matrix, called stochastic factorization, which fixes the size of the approximator while incorporating all the information contained in the data.
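The stochastic-factorization idea summarized above rests on a simple matrix identity: if a transition matrix factors as P = D K with D and K both row-stochastic, then the swapped product K D is a smaller stochastic matrix whose powers reproduce the dynamics of P, because (D K)^t D = D (K D)^t for every t ≥ 0. A toy numerical check of that identity (the matrices below are hand-picked assumptions for illustration, not part of the cited method):

```python
import numpy as np

# Toy stochastic factorization P = D @ K with n = 4 states, m = 2 factors.
D = np.array([[1.0, 0.0],
              [0.7, 0.3],
              [0.2, 0.8],
              [0.0, 1.0]])            # 4 x 2, rows sum to 1
K = np.array([[0.6, 0.4, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]])  # 2 x 4, rows sum to 1
P = D @ K                             # 4 x 4 transition matrix

# Swapping the factors yields a smaller 2 x 2 stochastic matrix.
P_small = K @ D

# Multi-step transitions agree: (D @ K)**t @ D == D @ (K @ D)**t.
t = 3
lhs = np.linalg.matrix_power(P, t) @ D
rhs = D @ np.linalg.matrix_power(P_small, t)
print(np.allclose(lhs, rhs))  # True
```

This is why fixing the number of factors m fixes the size of the approximator: dynamic programming can be carried out in the m-dimensional compressed space while remaining faithful to the original n-state dynamics.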

Approximate and Data-Driven Dynamic Programming for Queueing Networks

TL;DR: An approach based on temporal difference learning is developed to address scheduling problems in complex queueing networks such as those arising in service, communication, and manufacturing systems, and is extended to a setting where aspects of the queueing network are not modeled and the approach must rely instead on empirical data.