Open Access · Proceedings Article
Non-parametric Approximate Dynamic Programming via the Kernel Method
Nikhil Bhat, Vivek F. Farias, Ciamac C. Moallemi, et al.
Vol. 25, pp. 386–394
TL;DR: A novel non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful approximation and sample complexity guarantees and can serve as a viable alternative to state-of-the-art parametric ADP algorithms.

Abstract:
This paper presents a novel non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful approximation and sample complexity guarantees. In particular, we establish both theoretically and computationally that our proposal can serve as a viable alternative to state-of-the-art parametric ADP algorithms, freeing the designer from carefully specifying an approximation architecture. We accomplish this by developing a kernel-based mathematical program for ADP. Via a computational study on a controlled queueing network, we show that our procedure is competitive with parametric ADP approaches.
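The kernel-based mathematical program itself is not reproduced on this page. Purely as a rough illustration of the general idea, fitting a kernel expansion V(s) = Σⱼ αⱼ k(s, sⱼ) to sampled Bellman targets might look like the sketch below; the Gaussian kernel, the ridge-style regularizer, and the gradient-descent fit are all simplifying assumptions, not the paper's actual formulation:

```python
import math

def rbf(x, y, bw=1.0):
    # Gaussian (RBF) kernel between scalar states x and y.
    return math.exp(-((x - y) ** 2) / (2.0 * bw ** 2))

def fit_kernel_values(states, targets, reg=1e-3, bw=1.0, lr=0.1, iters=2000):
    # Represent V(s) = sum_j alpha[j] * k(s, states[j]) and fit alpha by
    # gradient descent on a ridge-regularized squared loss against sampled
    # Bellman targets. This least-squares fit is a stand-in for the
    # paper's kernel-based mathematical program, not the program itself.
    n = len(states)
    K = [[rbf(si, sj, bw) for sj in states] for si in states]
    alpha = [0.0] * n
    for _ in range(iters):
        preds = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        for j in range(n):
            grad = sum((preds[i] - targets[i]) * K[i][j] for i in range(n)) + reg * alpha[j]
            alpha[j] -= lr * grad
    def value(s):
        # Evaluate the fitted kernel expansion at an arbitrary state.
        return sum(alpha[j] * rbf(s, states[j], bw) for j in range(n))
    return value
```

The non-parametric appeal is visible even in this toy version: the representation grows with the sampled states rather than being fixed in advance by a hand-chosen basis.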
Citations
Posted Content
Q-learning with Nearest Neighbors
Devavrat Shah, Qiaomin Xie, et al.
TL;DR: In this article, the authors consider model-free reinforcement learning for infinite-horizon discounted Markov Decision Processes (MDPs) with a continuous state space and unknown transition kernel, and provide a tight finite-sample analysis of the convergence rate.
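The nearest-neighbor idea in this TL;DR can be sketched as tabular Q-learning over a finite grid of anchor states, with continuous states looked up via their nearest anchor. The toy chain environment, anchor grid, and learning parameters below are invented for illustration and are not from the paper:

```python
import random

def nn_q_update(anchors, Q, s, a, r, s_next, gamma=0.9, lr=0.5):
    # Q-learning step where the Q-function over a continuous state space
    # is stored only at anchor states and read via nearest neighbor.
    i = min(range(len(anchors)), key=lambda k: abs(anchors[k] - s))
    j = min(range(len(anchors)), key=lambda k: abs(anchors[k] - s_next))
    target = r + gamma * max(Q[j])
    Q[i][a] += lr * (target - Q[i][a])

def train_chain(episodes=500, seed=0):
    # Toy chain: state in [0, 1]; action 0 steps left, action 1 steps
    # right; reward 1 for reaching the right end. Illustrative only.
    rng = random.Random(seed)
    anchors = [0.0, 0.25, 0.5, 0.75, 1.0]
    Q = [[0.0, 0.0] for _ in anchors]
    for _ in range(episodes):
        s = rng.random()
        for _ in range(20):
            a = rng.randrange(2)  # pure random exploration
            s_next = min(1.0, max(0.0, s + (0.1 if a == 1 else -0.1)))
            r = 1.0 if s_next >= 0.95 else 0.0
            nn_q_update(anchors, Q, s, a, r, s_next)
            s = s_next
    return anchors, Q
```

After training, the learned values at the middle anchor favor stepping toward the rewarding end, which is the qualitative behavior one would expect; the paper's contribution is the finite-sample analysis of schemes of this type, not this toy loop.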
Journal ArticleDOI
Practical kernel-based reinforcement learning
TL;DR: An algorithm is presented that turns KBRL into a practical reinforcement learning tool; it significantly outperforms other state-of-the-art reinforcement learning algorithms on the tasks studied, and upper bounds are derived for the distance between the value functions computed by KBRL and by KBSF (kernel-based stochastic factorization) using the same data.
Journal ArticleDOI
A comparison of Monte Carlo tree search and rolling horizon optimization for large-scale dynamic resource allocation problems
TL;DR: This paper adapts MCTS and RHO to two problems – one inspired by tactical wildfire management and a classical problem involving the control of queueing networks – and undertakes an extensive computational study comparing the two methods on instances of both problems that are large in both their state and action spaces.
Journal ArticleDOI
Multi-period portfolio selection using kernel-based control policy with dimensionality reduction
Yuichi Takano, Jun-ya Gotoh, et al.
TL;DR: Numerical experiments show that the nonlinear control policy developed in this paper not only reduces computation time but also improves out-of-sample investment performance.
Journal ArticleDOI
Shape Constraints in Economics and Operations Research
TL;DR: This paper briefly reviews an illustrative set of research utilizing shape constraints in the economics and operations research literature, highlighting methodological innovations and applications with particular emphasis on utility functions, production economics, and sequential decision making.
References
Proceedings Article
Batch Value Function Approximation via Support Vectors
Thomas G. Dietterich, Xin Wang, et al.
TL;DR: Three ways of combining linear programming with the kernel trick to find value function approximations for reinforcement learning are presented: the first is based on SVM regression; the second on the Bellman equation; and the third seeks only to ensure that good moves have an advantage over bad moves.
Journal ArticleDOI
Approximate Dynamic Programming via a Smoothed Linear Program
TL;DR: A novel linear program, called the "smoothed approximate linear program," is presented for approximating the dynamic programming cost-to-go function in high-dimensional stochastic control problems; it outperforms the existing LP approach by a substantial margin.
Proceedings Article
Feature Selection Using Regularization in Approximate Linear Programs for Markov Decision Processes
TL;DR: The proposed L1 regularization method automatically selects an appropriately rich set of features, and its performance does not degrade as the number of features grows; the analysis relies on new, stronger sampling bounds for regularized approximate linear programs.
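The regularized approximate linear program itself requires an LP solver. Purely to illustrate how an L1 penalty drives the weights of unhelpful basis functions to exactly zero, here is a proximal-gradient sketch on a least-squares surrogate; the surrogate objective, step sizes, and data are assumptions for the example, not the paper's formulation:

```python
def soft_threshold(z, t):
    # Proximal operator of the L1 norm: shrinks z toward zero by t.
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def l1_feature_weights(features, targets, reg=0.1, iters=500, lr=0.05):
    # Proximal-gradient fit of a linear model with an L1 penalty.
    # Weights on candidate basis functions that do not help explain the
    # targets are driven to exactly zero, which is the feature-selection
    # effect the paper exploits inside an approximate linear program.
    n, d = len(features), len(features[0])
    w = [0.0] * d
    for _ in range(iters):
        preds = [sum(features[i][j] * w[j] for j in range(d)) for i in range(n)]
        for j in range(d):
            grad = sum((preds[i] - targets[i]) * features[i][j] for i in range(n)) / n
            w[j] = soft_threshold(w[j] - lr * grad, lr * reg)
    return w
```

On data where only the first feature explains the target, the second weight stays at exactly zero, i.e. the feature is deselected rather than merely shrunk.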
Journal ArticleDOI
Kernel-based reinforcement learning in average-cost problems
Dirk Ormoneit, Peter W. Glynn, et al.
TL;DR: This work presents a new, kernel-based approach to reinforcement learning that avoids the convergence difficulties of earlier approaches and provably converges to a unique solution; it can be shown to be consistent in the sense that its costs converge to the optimal costs asymptotically.
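The core of the kernel-based approach is a Bellman backup approximated by a normalized kernel average over sampled transitions. The sketch below shows this averaging step in a simplified scalar-state, discounted-style form; the bandwidth and Gaussian kernel are illustrative choices, and the paper's actual average-cost formulation is more involved:

```python
import math

def kbrl_backup(samples, value, query, bw=0.2):
    # Approximate the expected one-step return E[r + V(s')] at a query
    # state by a normalized kernel average over sampled transitions
    # (s, r, s'). Nearby samples get exponentially larger weight.
    weights = [math.exp(-((s - query) ** 2) / (2.0 * bw ** 2))
               for (s, _, _) in samples]
    total = sum(weights)
    return sum(w * (r + value(s_next))
               for w, (_, r, s_next) in zip(weights, samples)) / total
```

Because the backup only ever averages sampled returns, iterating it over a fixed sample set is well behaved, and consistency is obtained as the sample size grows, which is the property the TL;DR refers to.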
Proceedings Article
Kernel-Based Reinforcement Learning in Average-Cost Problems: An Application to Optimal Portfolio Choice
Dirk Ormoneit, Peter W. Glynn, et al.
TL;DR: In this article, a kernel-based approach to reinforcement learning is presented that provably converges to a unique solution in an average-cost framework; it is demonstrated on a practical application to the optimal portfolio choice problem.