Journal ArticleDOI

Bandits With Heavy Tail

TLDR
This paper examines the bandit problem under the weaker assumption that the distributions have moments of order 1 + ε, and derives matching lower bounds that show that the best achievable regret deteriorates when ε < 1.
Abstract
The stochastic multiarmed bandit problem is well understood when the reward distributions are sub-Gaussian. In this paper, we examine the bandit problem under the weaker assumption that the distributions have moments of order 1 + ε, for some ε ∈ (0,1]. Surprisingly, moments of order 2 (i.e., finite variance) are sufficient to obtain regret bounds of the same order as under sub-Gaussian reward distributions. In order to achieve such regret, we define sampling strategies based on refined estimators of the mean such as the truncated empirical mean, Catoni's M-estimator, and the median-of-means estimator. We also derive matching lower bounds showing that the best achievable regret deteriorates when ε < 1.
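As a concrete illustration of one of these refined mean estimators, here is a minimal sketch of the median-of-means estimator in Python. The choice of block count k and its link to the confidence level are assumptions of this sketch; the paper's Robust UCB policy additionally wraps such an estimator in a confidence bound, which is omitted here.

```python
import numpy as np

def median_of_means(x, k):
    """Median-of-means estimate of E[X] from samples x, using k blocks.

    Split the samples into k equal-size blocks, average each block,
    and return the median of the block means. With k on the order of
    log(1/delta) blocks, this gives sub-Gaussian-style deviations while
    assuming only a finite variance (moments of order 2).
    """
    x = np.asarray(x, dtype=float)
    n = (len(x) // k) * k          # drop the remainder so blocks are equal
    blocks = x[:n].reshape(k, -1)  # k blocks of n // k samples each
    return np.median(blocks.mean(axis=1))
```

With roughly k ≈ 8·log(1/δ) blocks, the estimate deviates from the true mean by at most a constant multiple of sqrt(Var(X)·log(1/δ)/n) with probability 1 − δ, assuming only finite variance.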



Citations
Book

Regret Analysis of Stochastic and Nonstochastic Multi-Armed Bandit Problems

TL;DR: In this article, the authors survey regret analysis for multi-armed bandit problems, where regret measures the payoff lost by not always playing the single best option, and the central difficulty is balancing staying with the option that gave the highest payoff in the past against exploring options that might give higher payoffs in the future.
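For reference, the pseudo-regret analyzed in this line of work is commonly written as follows (notation assumed here: K arms with mean rewards μ_i, and I_t the arm played at round t):

```latex
\bar{R}_n = n\,\mu^* - \mathbb{E}\Big[\sum_{t=1}^{n} \mu_{I_t}\Big],
\qquad \mu^* = \max_{1 \le i \le K} \mu_i
```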
Posted Content

Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates

TL;DR: In this article, the authors develop distributed learning algorithms that are provably robust against Byzantine failures, with a focus on optimal statistical performance, and show that the algorithms achieve order-optimal statistical error rates for strongly convex losses.
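A minimal sketch of the kind of robust gradient aggregation such algorithms build on, assuming the server receives one gradient per worker as a NumPy row; the precise rules and their statistical analysis are the paper's contribution, and the function names here are illustrative:

```python
import numpy as np

def coordinate_wise_median(gradients):
    """Aggregate worker gradients by taking the median of each coordinate.

    gradients: array of shape (m, d), one d-dimensional gradient per
    worker. The median in each coordinate replaces the usual average,
    so a minority of Byzantine workers cannot drag the update far.
    """
    return np.median(np.asarray(gradients), axis=0)

def trimmed_mean(gradients, beta):
    """Coordinate-wise beta-trimmed mean: in each coordinate, drop the
    largest and smallest beta*m values, then average the rest."""
    g = np.sort(np.asarray(gradients), axis=0)
    m = g.shape[0]
    b = int(beta * m)
    return g[b:m - b].mean(axis=0)
```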
Journal ArticleDOI

Geometric median and robust estimation in Banach spaces

Stanislav Minsker
01 Nov 2015
TL;DR: In this paper, it is shown that the geometric median of a collection of independent "weakly concentrated" estimators satisfies a much stronger deviation bound than each individual element of the collection; this is illustrated through several examples, including sparse linear regression and low-rank matrix recovery problems.
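A standard way to compute the geometric median numerically is the Weiszfeld iteration, sketched below; this is a generic solver under the stated assumptions, not the paper's own construction, which applies the geometric median to a collection of independent estimators:

```python
import numpy as np

def geometric_median(points, n_iter=100, tol=1e-7):
    """Weiszfeld iteration for the geometric median of row vectors.

    The geometric median minimizes the sum of Euclidean distances to
    the points, making it far less sensitive to outliers than the mean.
    """
    pts = np.asarray(points, dtype=float)
    y = pts.mean(axis=0)                      # start from the mean
    for _ in range(n_iter):
        d = np.linalg.norm(pts - y, axis=1)
        d = np.where(d < tol, tol, d)         # avoid division by zero
        w = 1.0 / d
        y_new = (w[:, None] * pts).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y
```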
Proceedings ArticleDOI

Stochastic bandits robust to adversarial corruptions

TL;DR: In this article, the authors introduce a new model of stochastic bandits with adversarial corruptions, which aims to capture settings where most of the input follows a stochastic pattern but some fraction of it can be adversarially changed to trick the algorithm, e.g., click fraud, fake reviews, and email spam.
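A toy version of the corruption model, under assumptions chosen purely for illustration (Bernoulli base rewards and an adversary that overwrites up to a fixed budget of rounds); the paper's actual protocol lets the adversary act adaptively:

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupted_rewards(mu, n, corruption_budget):
    """Draw n i.i.d. Bernoulli(mu) rewards, then let an adversary
    overwrite up to `corruption_budget` of them (here: set to zero).
    This is a simplified, non-adaptive stand-in for the paper's model."""
    r = rng.binomial(1, mu, size=n).astype(float)
    bad = rng.choice(n, size=min(corruption_budget, n), replace=False)
    r[bad] = 0.0   # adversary zeroes out the chosen rounds
    return r
```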
References
Journal ArticleDOI

Finite-time Analysis of the Multiarmed Bandit Problem

TL;DR: This work shows that the optimal logarithmic regret is also achievable uniformly over time, with simple and efficient policies, and for all reward distributions with bounded support.
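The policy in question is the classic UCB1 index; a minimal sketch, assuming rewards in [0, 1] delivered by a caller-supplied pull(i):

```python
import numpy as np

def ucb1(pull, n_arms, horizon):
    """Minimal UCB1 sketch: each round, play the arm maximizing
    mean_i + sqrt(2 * ln(t) / n_i), where n_i counts plays of arm i.
    `pull(i)` is assumed to return a reward in [0, 1]."""
    counts = np.zeros(n_arms)
    sums = np.zeros(n_arms)
    for i in range(n_arms):                  # play each arm once
        sums[i] += pull(i)
        counts[i] += 1
    for t in range(n_arms + 1, horizon + 1):
        index = sums / counts + np.sqrt(2.0 * np.log(t) / counts)
        i = int(np.argmax(index))
        sums[i] += pull(i)
        counts[i] += 1
    return sums.sum()                        # total reward collected
```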
Journal ArticleDOI

Robust Estimation of a Location Parameter

TL;DR: In this article, a new approach toward a theory of robust estimation is presented, which treats in detail the asymptotic theory of estimating a location parameter for contaminated normal distributions, and exhibits estimators that are asymptotically most robust (in a sense to be specified) among all translation invariant estimators.
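Huber's location estimate can be computed by iteratively reweighted averaging; a minimal sketch, assuming unit scale and the conventional tuning constant c = 1.345 (in practice the scale is estimated jointly, e.g. from the MAD):

```python
import numpy as np

def huber_location(x, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate of location via iteratively reweighted averaging.

    Minimizes sum_i rho_c(x_i - m), where rho_c is quadratic on [-c, c]
    and linear outside, so large residuals get down-weighted instead of
    dominating the estimate as they do for the plain mean."""
    x = np.asarray(x, dtype=float)
    m = np.median(x)                   # robust starting point
    for _ in range(n_iter):
        r = x - m
        a = np.abs(r)
        w = np.where(a <= c, 1.0, c / np.maximum(a, tol))  # Huber weights
        m_new = np.sum(w * x) / np.sum(w)
        if abs(m_new - m) < tol:
            break
        m = m_new
    return m
```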
Book

Regret Analysis of Stochastic and Nonstochastic Multi-Armed Bandit Problems

TL;DR: In this article, the authors survey regret analysis for multi-armed bandit problems, where regret measures the payoff lost by not always playing the single best option, and the central difficulty is balancing staying with the option that gave the highest payoff in the past against exploring options that might give higher payoffs in the future.