Sébastien Bubeck

Researcher at Microsoft

Publications -  184
Citations -  13538

Sébastien Bubeck is an academic researcher at Microsoft. The author has contributed to research on topics including regret and upper and lower bounds. The author has an h-index of 45 and has co-authored 169 publications receiving 11076 citations. Previous affiliations of Sébastien Bubeck include the Max Planck Society and the Laboratoire d'Informatique Fondamentale de Lille.

Papers
Book

Regret Analysis of Stochastic and Nonstochastic Multi-Armed Bandit Problems

TL;DR: In this article, the authors focus on regret analysis for multi-armed bandit problems, the most basic sequential decision problems with an exploration-exploitation trade-off, that is, the balance between staying with the option that gave the highest payoffs in the past and exploring new options that might give higher payoffs in the future.
Book

Convex Optimization: Algorithms and Complexity

TL;DR: This monograph presents the main complexity theorems in convex optimization and their corresponding algorithms and provides a gentle introduction to structural optimization with FISTA, saddle-point mirror prox, Nemirovski's alternative to Nesterov's smoothing, and a concise description of interior point methods.
Proceedings Article

Best Arm Identification in Multi-Armed Bandits

TL;DR: In this paper, the regret of a forecaster is defined as the gap between the mean reward of the optimal arm and the mean reward of the ultimately chosen arm, and the regret is shown to decrease exponentially at a rate which is, up to a logarithmic factor, the best possible.
Book Chapter

Pure exploration in multi-armed bandits problems

TL;DR: The main result is that the required exploration-exploitation trade-offs are qualitatively different, in view of a general lower bound on the simple regret in terms of the cumulative regret.
Proceedings Article

Minimax policies for adversarial and stochastic bandits

TL;DR: This work fills in a long-open gap in the characterization of the minimax rate for the multi-armed bandit problem and proposes a new family of randomized algorithms based on an implicit normalization, as well as a new analysis.