




9,873 citations
...This problem has also been identified under slightly different names, such as: the new user problem [2], the cold start problem [3] or new-user ramp-up problem [4]....
[...]
6,361 citations
...The UCB algorithm estimates the value UCBtj for each plan....
[...]
...UCB gave us surprisingly good precision and prediction results....
[...]
...UCB [7] consists of selecting the rate-plan that maximises the following function:...
[...]
...The following three MAB algorithms are used: ε-greedy [7] aims at picking the rate-plan that is currently considered the best (i....
[...]
...The multi-armed bandit (MAB) is a classical problem in decision theory [5,6,7]....
[...]
3,883 citations
...This problem has also been identified under slightly different names, such as: the new user problem [2], the cold start problem [3] or new-user ramp-up problem [4]....
[...]
2,370 citations
...EXP3 performs even worse than ε-greedy....
[...]
...EXP3 [19] selects a rate-plan according to a distribution, which is a mixture of the uniform distribution and a distribution that assigns each plan a probability mass exponential in the estimated cumulative rewards for that plan....
[...]
...Finally, EXP3 selects a plan according to a given distribution, as described in [19]....
[...]
...In this equation, $\hat{\mu}_j$ favours a greedy selection (exploitation) while the second term $\sqrt{2 \ln t / t_j}$ favours exploration driven by uncertainty; it is a confidence interval on the true value of the expectation of reward for plan $j$. EXP3 [19] selects a rate-plan according to a distribution, which is a mixture of the uniform distribution and a distribution that assigns each plan a probability mass exponential in the estimated cumulative rewards for that plan....
[...]
2,143 citations
The authors would like to extend their gratitude to Professor Helge Langseth at the Department of Computer and Information Science at the Norwegian University of Science and Technology (NTNU), and to Dr. Humberto N. Castejón Martínez and Dr. Kenth Engø-Monsen at Telenor Research, without whom this work would not have been possible.
If the authors use the indicator function as the similarity measurement, then the problem becomes designing an algorithm that predicts the rate-plan $p^*_t$ chosen by the new user.
In the case of using the correlation value, the rewards of the non-selected rate-plans will be the correlation value between the two vectors p and p∗.
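The two similarity measurements mentioned above can be sketched as small reward functions. This is only an illustration: the source does not say which correlation coefficient is used, so Pearson correlation is assumed here, and the function names and the representation of a plan as a numeric feature vector are hypothetical.

```python
import math


def indicator_reward(recommended, chosen):
    """Return 1.0 when the recommended plan is exactly the chosen plan, else 0.0."""
    return 1.0 if recommended == chosen else 0.0


def correlation_reward(p, p_star):
    """Pearson correlation between two equal-length plan feature vectors.

    Assumption: Pearson correlation stands in for the unspecified
    "correlation value" between p and p*.
    """
    n = len(p)
    mean_p = sum(p) / n
    mean_s = sum(p_star) / n
    cov = sum((a - mean_p) * (b - mean_s) for a, b in zip(p, p_star))
    norm_p = math.sqrt(sum((a - mean_p) ** 2 for a in p))
    norm_s = math.sqrt(sum((b - mean_s) ** 2 for b in p_star))
    if norm_p == 0 or norm_s == 0:
        # Assumption: a constant vector carries no correlation signal.
        return 0.0
    return cov / (norm_p * norm_s)
```

With the indicator reward, only an exact match earns anything; with the correlation reward, a plan that resembles the chosen one still earns partial credit.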
The game of the recommender system is to repeatedly pick one of the rate-plans and suggest it to each new user when she enters the system.
If the authors denote the similarity between the recommended plan $p_t$ and the actual need of the new user $u_t$ by $\mathrm{similarity}(need_t, p_t)$, then the objective when solving the CSAR problem is to select the rate-plans $p_t$ that maximize the following so-called "cumulative reward" (Reward) over all $T$ new users:

$$\mathrm{Reward}_T = \sum_{t=1}^{T} \mathrm{similarity}(need_t, p_t)$$

The CSAR problem would be easy to solve if the authors knew the user's needs $need_t$.
For a better explanation, consider the UCB algorithm described in the previous section: the authors see that each recommendation of a rate-plan results from trading off the plan's average reward against the number of times the plan has been selected so far by users.
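The trade-off can be made concrete with a minimal sketch of a UCB-style selection rule, using the score "average reward plus the square root of 2 ln t divided by the selection count" quoted earlier on this page. The function names, the list-based state, and the convention of trying every plan once before scoring are assumptions for illustration, not the authors' implementation.

```python
import math


def ucb_select(mean_rewards, counts, t):
    """Return the index j maximising mean_rewards[j] + sqrt(2 ln t / counts[j])."""
    # Assumption: try every plan once first so that counts[j] > 0 below.
    for j, n in enumerate(counts):
        if n == 0:
            return j
    scores = [
        m + math.sqrt(2.0 * math.log(t) / n)
        for m, n in zip(mean_rewards, counts)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


def ucb_update(mean_rewards, counts, j, reward):
    """Fold one observed reward for plan j into its running average."""
    counts[j] += 1
    mean_rewards[j] += (reward - mean_rewards[j]) / counts[j]
```

A plan with a slightly lower average but very few selections can outscore a well-explored plan, because its exploration term is large; as its count grows, the term shrinks and the average dominates.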
To evaluate any algorithm solving this problem, the authors can use the classical precision measurement:

$$\mathrm{Precision}_T = \frac{1}{T}\,\mathrm{Reward}_T \qquad (1)$$
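As a small worked example, precision can be computed from the per-user rewards by dividing the cumulative reward by the number of users T. The function names and the list of rewards are hypothetical; the rewards are assumed to come from whichever similarity measurement is in use.

```python
def cumulative_reward(rewards):
    """Sum of the similarity rewards collected over all T new users."""
    return sum(rewards)


def precision(rewards):
    """Cumulative reward divided by the number of users T."""
    T = len(rewards)
    if T == 0:
        raise ValueError("precision is undefined for zero users")
    return cumulative_reward(rewards) / T
```

For instance, with indicator rewards `[1, 0, 0, 1]` (two correct recommendations out of four users), `precision` returns 0.5.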
This is also true for the second dataset, where a randomly recommended rate-plan has only a 1/13 ≈ 7.69% probability of being correct.
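The random baseline for the second dataset can be checked with a short sketch. It assumes 13 equally likely rate-plans and an indicator reward; the function names, the fixed seed, and the way the user's true plan is drawn are illustrative choices, not taken from the paper.

```python
import random


def random_baseline_precision(num_plans):
    """Expected precision of uniform random recommendation under the indicator reward."""
    return 1.0 / num_plans


def simulate_random_baseline(num_plans, num_users, seed=0):
    """Empirical precision of uniform random recommendation over num_users users."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_users):
        # Assumption: the user's true plan is drawn independently of the recommendation.
        true_plan = rng.randrange(num_plans)
        recommended = rng.randrange(num_plans)
        hits += 1 if recommended == true_plan else 0
    return hits / num_users
```

Calling `random_baseline_precision(13)` gives the analytic value, and `simulate_random_baseline(13, 100_000)` should land close to it; any bandit algorithm worth using has to beat this floor.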