Reinforcement Learning: An Introduction
Citations
18 citations
Cites background or methods from "Reinforcement Learning: An Introduc..."
...DQN, proposed in [12], makes it possible for RL to learn directly from high-dimensional inputs, a massive step forward from traditional RL [14]....
[...]
...In our experiment, we use UCB-1 [14] and Discounted-UCB [6], both of which use a moving average to estimate the expected performance; the difference is that the latter gives more weight to more recent measurements....
[...]
...For UCB-1 and Discounted-UCB, a critical limitation also separates them from the optimal solution: since they use a straightforward estimator (e.g., a moving average) to track changes in either workload patterns or CDN provider performance, they cannot make good use of historical information....
[...]
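The moving-average estimation the excerpts describe can be sketched as follows. This is a minimal illustration, not the cited papers' implementation; the discount factor `gamma` and the exploration constant `c` are assumed values:

```python
import math

def ucb1_index(mean, n_arm, n_total, c=2.0):
    """UCB-1 index: empirical mean reward plus an exploration bonus
    that shrinks as the arm is pulled more often."""
    return mean + math.sqrt(c * math.log(n_total) / n_arm)

class DiscountedEstimator:
    """Discounted (exponentially weighted) moving-average estimate of
    an arm's reward. With gamma < 1, recent measurements receive more
    weight than old ones, as the excerpt notes for Discounted-UCB."""

    def __init__(self, gamma=0.9):
        self.gamma = gamma
        self.weighted_sum = 0.0
        self.weight = 0.0

    def update(self, reward):
        # Decay all past observations, then add the new one at full weight.
        self.weighted_sum = self.gamma * self.weighted_sum + reward
        self.weight = self.gamma * self.weight + 1.0

    @property
    def mean(self):
        return self.weighted_sum / self.weight if self.weight else 0.0
```

For example, with `gamma=0.5` and rewards `[0.0, 1.0]`, the discounted mean is about 0.67, above the plain average of 0.5, reflecting the bias toward the recent observation.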
Cites background from "Reinforcement Learning: An Introduc..."
...In reinforcement learning (RL) [2], an autonomous agent in a given state selects an action and then transitions to a new state randomly, depending on its current state and action, at which point the environment reveals a reward....
[...]
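The agent-environment interaction described in this excerpt can be sketched as a generic episode loop. The function names (`env_step`, `select_action`) and the `horizon` cutoff are illustrative assumptions, not part of the cited text:

```python
def run_episode(env_step, select_action, init_state, horizon=100):
    """Generic RL interaction loop: in each state the agent selects an
    action, the environment transitions to a new state (possibly at
    random) and reveals a reward, and the loop repeats."""
    state = init_state
    total_reward = 0.0
    for _ in range(horizon):
        action = select_action(state)                  # agent's choice
        state, reward, done = env_step(state, action)  # env transition + reward
        total_reward += reward
        if done:
            break
    return total_reward
```

Any concrete environment only needs to supply an `env_step(state, action)` returning `(next_state, reward, done)` and a policy `select_action(state)`.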