Yishay Mansour

Researcher at Tel Aviv University

Publications -  546
Citations -  30407

Yishay Mansour is an academic researcher from Tel Aviv University. The author has contributed to research in topics: Regret & Upper and lower bounds. The author has an h-index of 80 and has co-authored 511 publications receiving 26984 citations. Previous affiliations of Yishay Mansour include Technion – Israel Institute of Technology & IBM.

Papers
Proceedings Article

Policy Gradient Methods for Reinforcement Learning with Function Approximation

TL;DR: This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
Journal Article

Constant depth circuits, Fourier transform, and learnability

TL;DR: It is shown that an AC^0 Boolean function has almost all of its "power spectrum" on the low-order coefficients, implying several new properties of functions in AC^0: functions in AC^0 have low "average sensitivity", they may be approximated well by a real polynomial of low degree, and they cannot be pseudorandom function generators.
Journal Article

Action Elimination and Stopping Conditions for the Multi-Armed Bandit and Reinforcement Learning Problems

TL;DR: A framework is described that learns confidence intervals around the value function or the Q-function and eliminates actions that are not optimal (with high probability); both model-based and model-free variants of the elimination method are provided.
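The elimination idea in this summary can be illustrated for the multi-armed bandit case with a minimal sketch. It is not taken from the paper: the function name `successive_elimination`, the `pull` callback, the `max_rounds` cap, and the particular Hoeffding-style confidence radius (which assumes rewards in [0, 1]) are all assumptions made for illustration.

```python
import math
import random


def successive_elimination(pull, n_arms, delta=0.05, max_rounds=10000):
    """Return the arms still active after confidence-based elimination.

    `pull(arm)` is a hypothetical callback returning a reward in [0, 1].
    """
    active = list(range(n_arms))
    sums = [0.0] * n_arms
    for t in range(1, max_rounds + 1):
        if len(active) == 1:
            break
        # Sample every surviving arm once, so each has exactly t samples.
        for arm in active:
            sums[arm] += pull(arm)
        # Assumed Hoeffding-style radius with a union bound over arms and rounds.
        radius = math.sqrt(math.log(4 * n_arms * t * t / delta) / (2 * t))
        means = {arm: sums[arm] / t for arm in active}
        best = max(means.values())
        # Keep an arm only if its upper bound reaches the best lower bound.
        active = [arm for arm in active if means[arm] + radius >= best - radius]
    return active


if __name__ == "__main__":
    random.seed(0)
    true_means = [0.9, 0.5, 0.2]  # illustrative Bernoulli arms
    print(successive_elimination(
        lambda arm: 1.0 if random.random() < true_means[arm] else 0.0, 3))
```

An arm is dropped only once its confidence interval lies entirely below that of the empirically best arm, so under the stated assumptions the best arm survives with probability at least 1 - delta.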
Proceedings Article

Domain adaptation: Learning bounds and algorithms

TL;DR: This paper introduces a distance between distributions, the discrepancy distance, that is tailored to adaptation problems with arbitrary loss functions, and gives Rademacher complexity bounds for estimating the discrepancy distance from finite samples for different loss functions.
Proceedings Article

A sparse sampling algorithm for near-optimal planning in large Markov decision processes

TL;DR: In this paper, the authors present an algorithm that, given only a generative model (simulator) for an arbitrary MDP, performs near-optimal planning with a running time that has no dependence on the number of states.
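A minimal sketch of the sparse-sampling idea, assuming a generative model exposed as a hypothetical callback `model(state, action)` that returns a sampled `(next_state, reward)` pair. The function name and the parameters `depth`, `width`, and `gamma` are illustrative choices, not the paper's notation.

```python
def sparse_sampling_q(model, state, actions, depth, width, gamma):
    """Estimate Q(state, a) for each action with a depth-limited sampled lookahead tree.

    `model(state, action)` is a hypothetical simulator returning (next_state, reward).
    """
    if depth == 0:
        # Leaves of the lookahead tree are valued at zero.
        return {a: 0.0 for a in actions}
    q = {}
    for a in actions:
        total = 0.0
        # Draw `width` sampled successors instead of enumerating all states.
        for _ in range(width):
            next_state, reward = model(state, a)
            next_q = sparse_sampling_q(model, next_state, actions,
                                       depth - 1, width, gamma)
            total += reward + gamma * max(next_q.values())
        q[a] = total / width
    return q


def plan(model, state, actions, depth, width, gamma):
    """Pick the action with the highest estimated Q-value at `state`."""
    q = sparse_sampling_q(model, state, actions, depth, width, gamma)
    return max(q, key=q.get)
```

The number of simulator calls in this sketch grows as (width × number of actions)^depth, which depends on the lookahead parameters but not on the size of the state space.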