Open Access · Proceedings Article
Optimal Best Arm Identification with Fixed Confidence
Aurélien Garivier, Emilie Kaufmann
Vol. 49, pp. 998–1027
TL;DR: A new, tight lower bound on the sample complexity of best-arm identification in one-parameter bandit problems is proved, and the 'Track-and-Stop' strategy is proposed and shown to be asymptotically optimal.

Abstract: We provide a complete characterization of the complexity of best-arm identification in one-parameter bandit problems. We prove a new, tight lower bound on the sample complexity. We propose the 'Track-and-Stop' strategy, which is proved to be asymptotically optimal. It consists of a new sampling rule (which tracks the optimal proportions of arm draws highlighted by the lower bound) and a stopping rule named after Chernoff, for which we give a new analysis.
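The two components described in the abstract can be sketched in code. Below is a minimal, illustrative Python version for Bernoulli arms: the Chernoff stopping rule is implemented through the generalized likelihood ratio statistic, but, for brevity, the sampling rule is plain round-robin exploration rather than the paper's tracking of the optimal proportions, and the threshold `beta(t, delta) = log((log t + 1) / delta)` is a common heuristic choice, not the exact constant from the paper. All function names are ours.

```python
import math

def kl_bernoulli(p, q):
    """Binary relative entropy d(p, q), clipped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def glr_statistic(counts, means, a, b):
    """Chernoff/GLR statistic Z_ab(t) comparing arm a against arm b."""
    na, nb = counts[a], counts[b]
    mu_ab = (na * means[a] + nb * means[b]) / (na + nb)
    return na * kl_bernoulli(means[a], mu_ab) + nb * kl_bernoulli(means[b], mu_ab)

def best_arm_chernoff(pull, n_arms, delta, max_steps=100_000):
    """Simplified fixed-confidence best-arm identification.

    `pull(a)` returns a Bernoulli reward for arm `a`.  The sampling rule
    here is round-robin (a placeholder for the paper's tracking rule);
    the stopping rule is Chernoff's GLR test at risk level `delta`.
    """
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    best = 0
    for t in range(1, max_steps + 1):
        a = (t - 1) % n_arms          # placeholder sampling rule
        sums[a] += pull(a)
        counts[a] += 1
        if min(counts) == 0:
            continue
        means = [s / n for s, n in zip(sums, counts)]
        best = max(range(n_arms), key=lambda i: means[i])
        # Stop when the GLR against every alternative exceeds beta(t, delta).
        beta = math.log((math.log(t) + 1) / delta)
        if all(glr_statistic(counts, means, best, b) > beta
               for b in range(n_arms) if b != best):
            return best, t
    return best, max_steps
```

On well-separated arms the GLR statistic grows linearly in t while the threshold grows only doubly logarithmically, so the procedure stops quickly; the paper's tracking rule additionally ensures the sampling proportions converge to the optimum identified by the lower bound.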
Citations
Proceedings Article
Simple Bayesian Algorithms for Best Arm Identification
TL;DR: In this paper, the optimal adaptive allocation of measurement effort for identifying the best among a finite set of options or designs is studied, with a focus on selecting the best design after a small number of measurements.
Posted Content
Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals
Emilie Kaufmann, Wouter M. Koolen
TL;DR: New deviation inequalities that are valid uniformly in time under adaptive sampling in a multi-armed bandit model are presented, allowing us to analyze stopping rules based on generalized likelihood ratios for a large class of sequential identification problems, and to construct tight confidence intervals for some functions of the means of the arms.
Posted Content
Improving the Expected Improvement Algorithm
TL;DR: In this article, a simple modification of the expected improvement algorithm is proposed, which is asymptotically optimal for Gaussian best-arm identification problems, and provably outperforms standard EI by an order of magnitude.
Proceedings Article
On Explore-Then-Commit strategies
TL;DR: Existing deviation inequalities are refined, which allow us to design fully sequential strategies with finite-time regret guarantees that are asymptotically optimal as the horizon grows and order-optimal in the minimax sense.
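For context, the baseline that the paper above refines can be stated in a few lines. This is a generic sketch of the classical Explore-Then-Commit template (not the paper's fully sequential strategies), with a hypothetical `pull(a)` reward oracle:

```python
def explore_then_commit(pull, n_arms, m, horizon):
    """Explore-Then-Commit: pull every arm m times, commit to the
    empirically best arm for the remaining rounds.

    Returns the committed arm and the total reward collected.
    """
    sums = [0.0] * n_arms
    total = 0.0
    # Exploration phase: m uniform pulls per arm.
    for a in range(n_arms):
        for _ in range(m):
            r = pull(a)
            sums[a] += r
            total += r
    # Commitment phase: play the empirical best for the rest of the horizon.
    best = max(range(n_arms), key=lambda a: sums[a])
    for _ in range(horizon - m * n_arms):
        total += pull(best)
    return best, total
```

The choice of `m` is the crux: too small and the commitment is often wrong, too large and exploration itself costs too much regret, which is why fully sequential strategies can do strictly better asymptotically.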
Posted Content
On the Optimal Sample Complexity for Best Arm Identification
Lijie Chen, Jian Li, et al.
TL;DR: The first lower bound for BEST-1-ARM is obtained that goes beyond the classic Mannor-Tsitsiklis lower bound, by an interesting reduction from Sign to BEST-1-ARM.
References
Journal Article
Modeling by shortest data description
TL;DR: The number of digits it takes to write down an observed sequence x1,...,xN of a time series depends on the model with its parameters that one assumes to have generated the observed data.
Book
Regret Analysis of Stochastic and Nonstochastic Multi-Armed Bandit Problems
TL;DR: In this book, the authors focus on regret analysis in the context of multi-armed bandit problems, where regret captures the trade-off between exploiting the option that gave the highest payoff in the past and exploring options that might give higher payoffs in the future.
Journal Article
Asymptotically efficient adaptive allocation rules
Tze Leung Lai, Herbert Robbins
Proceedings Article
Gaussian Process Optimization in the Bandit Setting: No Regret and Experimental Design
TL;DR: This work analyzes GP-UCB, an intuitive upper-confidence based algorithm, and bounds its cumulative regret in terms of maximal information gain, establishing a novel connection between GP optimization and experimental design and obtaining explicit sublinear regret bounds for many commonly used covariance functions.
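The GP-UCB idea described above (query where posterior mean plus a scaled posterior standard deviation is largest) can be illustrated with a toy sketch. This is our own minimal version on a fixed 1-D grid with a squared-exponential kernel and a constant exploration parameter `beta`; the paper instead sets `beta` via an information-gain schedule, and the function names here are illustrative only:

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=0.2):
    """Squared-exponential covariance between two sets of 1-D points."""
    d = X1[:, None] - X2[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_ucb(f, grid, n_iters=30, beta=4.0, noise=1e-3):
    """Toy GP-UCB: at each step query the grid point maximizing
    posterior mean + sqrt(beta) * posterior std; return the best query."""
    grid = np.asarray(grid, dtype=float)
    X, y = [], []
    for _ in range(n_iters):
        if not X:
            x = grid[len(grid) // 2]          # arbitrary first query
        else:
            Xa = np.array(X)
            K = rbf_kernel(Xa, Xa) + noise * np.eye(len(X))
            Ks = rbf_kernel(grid, Xa)
            mu = Ks @ np.linalg.solve(K, np.array(y))
            # Posterior variance: prior variance 1 minus explained part.
            var = 1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
            ucb = mu + np.sqrt(beta) * np.sqrt(np.clip(var, 0.0, None))
            x = grid[int(np.argmax(ucb))]
        X.append(float(x))
        y.append(f(float(x)))
    return X[int(np.argmax(y))]
```

Unvisited regions keep high posterior variance and hence a high upper confidence bound, so the rule alternates naturally between exploring them and exploiting around the best observations.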
Proceedings Article
Improved Algorithms for Linear Stochastic Bandits
TL;DR: The regret bound for linear stochastic bandits is improved by a logarithmic factor, and a simple modification of Auer's UCB algorithm is shown to achieve constant regret with high probability; experiments show a vast improvement over previous methods.