Aditya Gopalan
Researcher at Indian Institute of Science
Publications - 104
Citations - 1603
Aditya Gopalan is an academic researcher from Indian Institute of Science. The author has contributed to research in topics: Regret & Scheduling (computing). The author has an h-index of 19, co-authored 93 publications receiving 1235 citations. Previous affiliations of Aditya Gopalan include Technion – Israel Institute of Technology & University of Texas at Austin.
Papers
Proceedings Article
On Kernelized Multi-armed Bandits.
TL;DR: In this article, the authors considered the stochastic bandit problem with a continuous set of arms, with the expected reward function over the arms assumed to be fixed but unknown.
Proceedings Article
Thompson Sampling for Complex Online Problems
TL;DR: A frequentist regret bound is proved for Thompson sampling in a very general setting involving parameter, action and observation spaces and a likelihood function over them, and improved regret bounds are derived for classes of complex bandit problems involving selecting subsets of arms, including the first nontrivial regret bounds for nonlinear reward feedback from subsets.
Posted Content
Thompson Sampling for Learning Parameterized Markov Decision Processes
Aditya Gopalan, Shie Mannor +1 more
TL;DR: It is shown that the number of instants where suboptimal actions are chosen scales logarithmically with time, with high probability, and a frequentist regret bound for priors over general parameter spaces is derived.
Journal ArticleDOI
On Wireless Scheduling With Partial Channel-State Information
TL;DR: In this article, a time-slotted queueing system for a wireless downlink with multiple flows and a single server is considered, with exogenous arrivals and time-varying channels.
Proceedings Article
Thompson Sampling for Learning Parameterized Markov Decision Processes
Aditya Gopalan, Shie Mannor +1 more
TL;DR: In this paper, the authors consider reinforcement learning in parameterized Markov Decision Processes (MDPs) and derive a regret bound for priors over general parameter spaces, showing that the number of instants where suboptimal actions are chosen scales logarithmically with time, with high probability.