Aditya Gopalan

Researcher at Indian Institute of Science

Publications - 104
Citations - 1603

Aditya Gopalan is an academic researcher from the Indian Institute of Science. The author has contributed to research in topics: Regret & Scheduling (computing). The author has an h-index of 19 and has co-authored 93 publications receiving 1235 citations. Previous affiliations of Aditya Gopalan include Technion – Israel Institute of Technology & University of Texas at Austin.

Papers
Proceedings Article

On Kernelized Multi-armed Bandits.

TL;DR: In this article, the authors considered the stochastic bandit problem with a continuous set of arms, with the expected reward function over the arms assumed to be fixed but unknown.
Proceedings Article

Thompson Sampling for Complex Online Problems

TL;DR: A frequentist regret bound is proved for Thompson sampling in a very general setting involving parameter, action and observation spaces and a likelihood function over them. Improved regret bounds are derived for classes of complex bandit problems involving selecting subsets of arms, including the first nontrivial regret bounds for nonlinear reward feedback from subsets.
Posted Content

Thompson Sampling for Learning Parameterized Markov Decision Processes

TL;DR: It is shown that the number of instants where suboptimal actions are chosen scales logarithmically with time, with high probability, and a frequentist regret bound for priors over general parameter spaces is derived.
Journal Article

On Wireless Scheduling With Partial Channel-State Information

TL;DR: In this article, a time-slotted queueing system for a wireless downlink with multiple flows and a single server is considered, with exogenous arrivals and time-varying channels.
Proceedings Article

Thompson Sampling for Learning Parameterized Markov Decision Processes

TL;DR: In this paper, the authors consider reinforcement learning in parameterized Markov Decision Processes (MDPs) and derive a regret bound for priors over general parameter spaces, showing that the number of instants where suboptimal actions are chosen scales logarithmically with time, with high probability.