Open Access Proceedings Article

Non-parametric Approximate Dynamic Programming via the Kernel Method

TLDR
A novel non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful approximation and sample complexity guarantees and can serve as a viable alternative to state-of-the-art parametric ADP algorithms.
Abstract
This paper presents a novel non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful approximation and sample complexity guarantees. In particular, we establish both theoretically and computationally that our proposal can serve as a viable alternative to state-of-the-art parametric ADP algorithms, freeing the designer from carefully specifying an approximation architecture. We accomplish this by developing a kernel-based mathematical program for ADP. Via a computational study on a controlled queueing network, we show that our procedure is competitive with parametric ADP approaches.
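The abstract describes fitting value functions with kernels instead of a hand-specified approximation architecture. As a hedged illustration of that general idea (not the paper's actual kernel-based mathematical program), the sketch below fits a value function V(s) = Σᵢ αᵢ k(s, sᵢ) to sampled targets via kernel ridge regression; the RBF kernel, the toy 1-D state space, and all parameter values are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=50.0):
    # Gaussian (RBF) kernel matrix between state samples X and Y.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def fit_kernel_value_function(states, targets, lam=1e-6, gamma=50.0):
    # Kernel ridge regression: V(s) = sum_i alpha_i * k(s, s_i).
    K = rbf_kernel(states, states, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(states)), targets)
    return lambda s: rbf_kernel(np.atleast_2d(s), states, gamma) @ alpha

# Toy 1-D example: "value" targets follow a smooth function of the state.
rng = np.random.default_rng(0)
S = rng.uniform(0, 1, size=(50, 1))
y = np.sin(2 * np.pi * S[:, 0])
V = fit_kernel_value_function(S, y)
print(V([0.25])[0])  # close to sin(pi/2) = 1
```

The appeal of the non-parametric route is visible even in this toy: no basis functions are chosen by hand, and the fit's flexibility grows with the number of sampled states.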



Citations
Posted Content

Q-learning with Nearest Neighbors

TL;DR: In this article, the authors consider model-free reinforcement learning for infinite-horizon discounted Markov decision processes (MDPs) with a continuous state space and unknown transition kernel, and provide a tight finite-sample analysis of the convergence rate.
Journal ArticleDOI

Practical kernel-based reinforcement learning

TL;DR: This paper presents an algorithm that turns KBRL into a practical reinforcement learning tool, shows that the resulting method significantly outperforms other state-of-the-art reinforcement learning algorithms on the tasks studied, and derives upper bounds on the distance between the value functions computed by KBRL and KBSF using the same data.
Journal ArticleDOI

A comparison of Monte Carlo tree search and rolling horizon optimization for large-scale dynamic resource allocation problems

TL;DR: This paper adapts MCTS and RHO to two problems – one inspired by tactical wildfire management and a classical problem involving the control of queueing networks – and undertakes an extensive computational study comparing the two methods on instances of both problems that are large in both their state and action spaces.
Journal ArticleDOI

Multi-period portfolio selection using kernel-based control policy with dimensionality reduction

TL;DR: Numerical experiments show that the nonlinear control policy implemented in this paper works not only to reduce computation time but also to improve out-of-sample investment performance.
Journal ArticleDOI

Shape Constraints in Economics and Operations Research

TL;DR: This paper briefly reviews an illustrative set of research utilizing shape constraints in the economics and operations research literature, highlighting methodological innovations and applications with particular emphasis on utility functions, production economics, and sequential decision making.
References
Journal ArticleDOI

A Numerical Method for Solving Singular Stochastic Control Problems

TL;DR: A method is proposed that combines finite element methods that numerically solve partial differential equations with a policy update procedure based on the principle of smooth pasting to iteratively solve Hamilton-Jacobi-Bellman equations associated with the stochastic control problem.
Journal ArticleDOI

Heavy Traffic Convergence of a Controlled, Multiclass Queueing System

TL;DR: In this paper, a connection is established between the optimal sequencing problem for a two-station, two-customer-class queueing network and the problem of controlling a multidimensional diffusion process obtained as a heavy-traffic limit of the queueing problem.
Proceedings ArticleDOI

Value iteration and optimization of multiclass queueing networks

TL;DR: It is argued that a natural choice for the initial value function is the value function for the associated deterministic control problem based upon a fluid model, or the approximate solution to Poisson’s equation obtained from the LP of Kumar and Meyn.
Proceedings Article

Reinforcement Learning using Kernel-Based Stochastic Factorization

TL;DR: A novel algorithm is introduced that improves the scalability of kernel-based reinforcement learning by resorting to a special decomposition of a transition matrix, called stochastic factorization, which fixes the size of the approximator while incorporating all the information contained in the data.
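The stochastic-factorization idea summarized above rests on a simple matrix identity: if a transition matrix factors as P = D K with D and K both row-stochastic, then the swapped product K D is a smaller stochastic matrix whose powers reproduce the dynamics of P, because (D K)^t D = D (K D)^t for every t ≥ 0. A toy numerical check of that identity (the matrices below are hand-picked assumptions for illustration, not part of the cited method):

```python
import numpy as np

# Toy stochastic factorization P = D @ K with n = 4 states, m = 2 factors.
D = np.array([[1.0, 0.0],
              [0.7, 0.3],
              [0.2, 0.8],
              [0.0, 1.0]])            # 4 x 2, rows sum to 1
K = np.array([[0.6, 0.4, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]])  # 2 x 4, rows sum to 1
P = D @ K                             # 4 x 4 transition matrix

# Swapping the factors yields a smaller 2 x 2 stochastic matrix.
P_small = K @ D

# Multi-step transitions agree: (D @ K)**t @ D == D @ (K @ D)**t.
t = 3
lhs = np.linalg.matrix_power(P, t) @ D
rhs = D @ np.linalg.matrix_power(P_small, t)
print(np.allclose(lhs, rhs))  # True
```

This is why fixing the number of factors m fixes the size of the approximator: dynamic programming can be carried out in the m-dimensional compressed space while remaining faithful to the original n-state dynamics.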

Approximate and Data-Driven Dynamic Programming for Queueing Networks

TL;DR: An approach based on temporal difference learning is developed to address scheduling problems in complex queueing networks such as those arising in service, communication, and manufacturing systems, and is extended to a setting where aspects of the queueing network are not modeled and the approach must rely instead on empirical data.