Open Access Proceedings Article

Non-parametric Approximate Dynamic Programming via the Kernel Method

TLDR
A novel non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful approximation and sample complexity guarantees and can serve as a viable alternative to state-of-the-art parametric ADP algorithms.
Abstract
This paper presents a novel non-parametric approximate dynamic programming (ADP) algorithm that enjoys graceful approximation and sample complexity guarantees. In particular, we establish both theoretically and computationally that our proposal can serve as a viable alternative to state-of-the-art parametric ADP algorithms, freeing the designer from carefully specifying an approximation architecture. We accomplish this by developing a kernel-based mathematical program for ADP. Via a computational study on a controlled queueing network, we show that our procedure is competitive with parametric ADP approaches.


Citations
Journal Article

An Approximate Quadratic Programming for Efficient Bellman Equation Solution

TL;DR: Experimental results on two canonical reinforcement learning scenarios demonstrate that the proposed algorithm performs as well as or better than state-of-the-art algorithms, while significantly reducing computation time and improving robustness against state uncertainty.

Risk-Neutral and Risk-Averse Approximate Dynamic Programming Methods

TL;DR: This thesis presents a provably convergent algorithm that exploits the monotone structure of the problem in order to obtain near-optimal policies using a relatively small amount of computation (when compared to exact techniques).
Posted Content

Cooperative Stochastic Approximation with Random Constraint Sampling for Semi-Infinite Programming

TL;DR: This work develops a cooperative stochastic approximation (CSA) type algorithm for semi-infinite programming (SIP) in which the cut generation problem is solved inexactly, and proposes two specific random constraint sampling schemes to approximately solve the cut generation problem.
Posted Content

Randomized Primal-Dual Algorithms for Semi-Infinite Programming

Bo Wei, +1 more
TL;DR: A novel algorithm for semi-infinite programming that combines random constraint sampling with the classical primal-dual method is presented; it is also adapted to convex optimization problems with a finite (but possibly very large) number of constraints and shown to have the same convergence rates in that case.
Posted Content

Analysis and Optimisation of Bellman Residual Errors with Neural Function Approximation.

TL;DR: In this paper, the authors propose an approximate Newton's algorithm to minimize the Mean Squared Bellman Error (MSBE) function with a residual gradient formulation, which is shown numerically to converge locally quadratically to a global minimum.
References
Book

Dynamic Programming and Optimal Control

TL;DR: The leading and most up-to-date textbook on the far-ranging algorithmic methodology of Dynamic Programming, which can be used for optimal control, Markovian decision problems, planning and sequential decision making under uncertainty, and discrete/combinatorial optimization.
Book

Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond

TL;DR: Learning with Kernels provides an introduction to SVMs and related kernel methods that provide all of the concepts necessary to enable a reader equipped with some basic mathematical knowledge to enter the world of machine learning using theoretically well-founded yet easy-to-use kernel algorithms.
Book

Optimization by Vector Space Methods

TL;DR: This book shows engineers how to use optimization theory to solve complex problems with a minimum of mathematics and unifies the large field of optimization with a few geometric principles.
Journal Article

Stability properties of constrained queueing systems and scheduling policies for maximum throughput in multihop radio networks

TL;DR: The stability of a queueing network with interdependent servers is considered, and a policy is obtained that is optimal in the sense that its stability region is a superset of the stability region of every other scheduling policy; this stability region is characterized.
Book Chapter

Rademacher and Gaussian complexities: risk bounds and structural results

TL;DR: In this paper, the authors investigate the use of data-dependent estimates of the complexity of a function class, called Rademacher and Gaussian complexities, in a decision-theoretic setting and prove general risk bounds in terms of these complexities.