Yishay Mansour
Researcher at Tel Aviv University
Publications - 546
Citations - 30407
Yishay Mansour is an academic researcher at Tel Aviv University. The author has contributed to research on topics including regret and upper and lower bounds. The author has an h-index of 80 and has co-authored 511 publications receiving 26984 citations. Previous affiliations of Yishay Mansour include Technion – Israel Institute of Technology and IBM.
Papers
Proceedings Article
Policy Gradient Methods for Reinforcement Learning with Function Approximation
TL;DR: This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
Journal Article
Constant depth circuits, Fourier transform, and learnability
TL;DR: It is shown that an AC^0 Boolean function has almost all of its "power spectrum" on the low-order coefficients, implying several new properties of functions in AC^0: they have low "average sensitivity", they may be approximated well by a real polynomial of low degree, and they cannot be pseudorandom function generators.
Journal Article
Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems
TL;DR: A framework based on learning the confidence interval around the value function or the Q-function and eliminating actions that are not optimal (with high probability) is described, and model-based and model-free variants of the elimination method are provided.
Proceedings Article
Domain adaptation: Learning bounds and algorithms
TL;DR: Building on previous work by Ben-David et al., this paper introduces a distance between distributions, the discrepancy distance, that is tailored to adaptation problems with arbitrary loss functions, and gives Rademacher complexity bounds for estimating the discrepancy distance from finite samples for different loss functions.
Proceedings Article
A sparse sampling algorithm for near-optimal planning in large Markov decision processes
TL;DR: In this paper, the authors present an algorithm that, given only a generative model (simulator) for an arbitrary MDP, performs near-optimal planning with a running time that has no dependence on the number of states.