Shipra Agrawal
Researcher at Columbia University
Publications - 83
Citations - 5545
Shipra Agrawal is an academic researcher from Columbia University. The author has contributed to research in topics: Regret & Thompson sampling. The author has an h-index of 31 and has co-authored 80 publications receiving 4584 citations. Previous affiliations of Shipra Agrawal include Bell Labs & Alcatel-Lucent.
Papers
Proceedings Article
Analysis of Thompson Sampling for the Multi-armed Bandit Problem
Shipra Agrawal, Navin Goyal +1 more
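TL;DR: This paper shows that the Thompson sampling algorithm achieves logarithmic expected regret for the stochastic multi-armed bandit problem; for the two-armed case, the expected regret in time T is O(ln T/Δ + 1/Δ³), where Δ is the gap between the expected rewards of the two arms.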
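As an illustration of the algorithm this entry analyzes, here is a minimal sketch of Thompson sampling for Bernoulli rewards with a uniform Beta(1, 1) prior on each arm. The function name thompson_sampling, the arm means, the horizon, and the seed are assumptions chosen for the example, not values from the paper.

```python
import random

def thompson_sampling(arm_means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson sampling; return (pull counts, total reward)."""
    rng = random.Random(seed)
    n_arms = len(arm_means)
    successes = [0] * n_arms  # number of rewards equal to 1 seen per arm
    failures = [0] * n_arms   # number of rewards equal to 0 seen per arm
    total_reward = 0
    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (uniform prior),
        # then play the arm whose sample is largest.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(n_arms)]
        arm = max(range(n_arms), key=lambda i: samples[i])
        # Simulated environment: Bernoulli reward with the arm's true mean.
        reward = 1 if rng.random() < arm_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    pulls = [successes[i] + failures[i] for i in range(n_arms)]
    return pulls, total_reward

# Hypothetical three-armed instance; the last arm is the best one.
pulls, total_reward = thompson_sampling([0.2, 0.5, 0.8], horizon=2000)
print(pulls, total_reward)
```

In a run like this, most pulls should concentrate on the arm with the highest mean as the posteriors sharpen, which is the behavior the logarithmic regret bound quantifies.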
Proceedings Article
Thompson Sampling for Contextual Bandits with Linear Payoffs
Shipra Agrawal, Navin Goyal +1 more
TL;DR: In this article, a generalization of Thompson sampling is proposed for the stochastic contextual multi-armed bandit problem with linear payoff functions, where the contexts are provided by an adaptive adversary, and a high-probability regret bound of O((d²/ε)·√(T^(1+ε))) is shown.
Proceedings Article
Further Optimal Regret Bounds for Thompson Sampling
Shipra Agrawal, Navin Goyal +1 more
TL;DR: A novel regret analysis for Thompson Sampling is provided that proves the first near-optimal problem-independent bound of O(√(NT ln T)) on the expected regret of this algorithm, and simultaneously provides the optimal problem-dependent bound.
Journal Article
A Dynamic Near-Optimal Algorithm for Online Linear Programming
TL;DR: In this article, a learning-based algorithm is proposed to dynamically update a threshold price vector at geometric time intervals, where the dual prices learned from the revealed columns in the previous period are used to determine the sequential decisions in the current period.
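The idea summarized above can be sketched for a deliberately simplified setting: a single resource, where each arriving request has a reward and a size. This is an illustrative reduction, not the paper's algorithm; the names dual_price and online_allocate, the parameter eps, and the test data are assumptions, and any safety margin the paper may apply to the scaled capacity is omitted. With one resource, the dual price of the fractional LP is the reward-per-unit ratio of the request at which capacity runs out, so no LP solver is needed.

```python
import random

def dual_price(requests, capacity):
    """Dual price of the single-resource fractional LP: the reward/size ratio
    of the request at which capacity runs out (0 if it never runs out)."""
    used = 0.0
    for reward, size in sorted(requests, key=lambda r: r[0] / r[1], reverse=True):
        used += size
        if used >= capacity:
            return reward / size
    return 0.0

def online_allocate(requests, capacity, eps=0.1):
    """Accept or reject requests one at a time using a threshold price that is
    re-learned at geometrically spaced times eps*n, 2*eps*n, 4*eps*n, ..."""
    n = len(requests)
    price = float("inf")  # reject everything during the initial learning phase
    next_update = max(1, int(eps * n))
    remaining = capacity
    accepted = []
    for t, (reward, size) in enumerate(requests):
        if t == next_update:
            # Learn the price from the t requests revealed so far,
            # with the capacity scaled down to the fraction t/n.
            price = dual_price(requests[:t], capacity * t / n)
            next_update *= 2
        # Accept only if the reward beats the priced resource use and fits.
        if reward > price * size and size <= remaining:
            accepted.append(t)
            remaining -= size
    return accepted, remaining

# Hypothetical instance: 200 random requests with positive sizes.
rng = random.Random(0)
requests = [(rng.uniform(0.0, 1.0), rng.uniform(0.5, 1.5)) for _ in range(200)]
accepted, remaining = online_allocate(requests, capacity=40.0)
print(len(accepted), remaining)
```

Doubling the update interval keeps the number of price re-computations logarithmic in the number of requests while letting later decisions use prices learned from more data.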
Posted Content
Analysis of Thompson Sampling for the multi-armed bandit problem
Shipra Agrawal, Navin Goyal +1 more
TL;DR: For the first time, it is shown that the Thompson Sampling algorithm achieves logarithmic expected regret for the stochastic multi-armed bandit problem.