Wei Chen

Researcher at Microsoft

Publications: 226
Citations: 14,625

Wei Chen is an academic researcher at Microsoft. He has contributed to research topics including maximization and greedy algorithms, has an h-index of 47, and has co-authored 226 publications receiving 12,843 citations. His previous affiliations include the University of British Columbia and Stony Brook University.

Papers
Journal Article

Combinatorial multi-armed bandit and its extension to probabilistically triggered arms

TL;DR: In this article, the authors define a general framework for combinatorial multi-armed bandit (CMAB) problems, where subsets of base arms with unknown distributions form super arms and the reward of the super arm depends on the outcomes of all played arms.
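For intuition, the CMAB framework pairs a per-arm confidence bound with an offline maximization oracle that selects the super arm each round. Below is a minimal CUCB-style sketch of that loop, assuming Bernoulli base-arm outcomes and hypothetical `oracle` and `pull` callables; it is an illustration in the spirit of the paper, not its exact pseudocode.

```python
import math

def cucb(n_arms, oracle, pull, n_rounds):
    """CUCB-style sketch for CMAB. `oracle(weights)` is an offline
    maximization oracle returning a super arm (iterable of base-arm
    indices); `pull(super_arm)` plays it and yields (arm, outcome)
    pairs for every base arm that was triggered. Both callables and
    the Bernoulli outcome model are assumptions for illustration."""
    counts = [0] * n_arms   # observations per base arm
    means = [0.0] * n_arms  # empirical mean outcome per base arm
    for t in range(1, n_rounds + 1):
        # Optimistic estimate per base arm; never-observed arms get 1.0
        # so the oracle is pushed to explore them first.
        ucb = [min(1.0, means[i] + math.sqrt(3 * math.log(t) / (2 * counts[i])))
               if counts[i] > 0 else 1.0
               for i in range(n_arms)]
        super_arm = oracle(ucb)
        for i, outcome in pull(super_arm):  # outcomes of triggered arms only
            counts[i] += 1
            means[i] += (outcome - means[i]) / counts[i]
    return means
```

Probabilistic triggering is handled naturally here: only the arms actually observed in a round have their statistics updated.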
Proceedings Article

Combinatorial Pure Exploration of Multi-Armed Bandits

TL;DR: This paper presents general learning algorithms that work, in both the fixed-confidence and fixed-budget settings, for every decision class admitting an offline maximization oracle, and it establishes a general problem-dependent lower bound for the CPE problem.
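The stopping idea behind the fixed-confidence variant can be sketched directly: accept the empirically best set once no confidence-radius perturbation can dethrone it. The following is a minimal CLUCB-style sketch, where the `oracle` and `pull` callables and the exact radius constant are assumptions for illustration.

```python
import math

def clucb(n_arms, oracle, pull, delta):
    """Fixed-confidence CPE sketch in the spirit of CLUCB. `oracle`
    maps a weight vector to the best feasible set of arm indices;
    `pull(i)` returns one sample of arm i. Both callables and the
    confidence-radius form are assumptions for illustration."""
    counts = [1] * n_arms
    means = [pull(i) for i in range(n_arms)]  # one initial sample each
    t = n_arms
    while True:
        t += 1
        rad = [math.sqrt(2 * math.log(4 * n_arms * t ** 3 / delta) / counts[i])
               for i in range(n_arms)]
        best = set(oracle(means))  # best set under empirical means
        # Tilt means against `best`: penalize members, boost non-members.
        tilted = [means[i] - rad[i] if i in best else means[i] + rad[i]
                  for i in range(n_arms)]
        rival = set(oracle(tilted))
        if rival == best:
            return best  # no perturbation can replace it: stop
        # Sample the most uncertain arm among those the two sets disagree on.
        i = max(best ^ rival, key=lambda j: rad[j])
        counts[i] += 1
        means[i] += (pull(i) - means[i]) / counts[i]
```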
Proceedings Article

Combinatorial multi-armed bandit: general framework, results and applications

TL;DR: The regret analysis is tight in that it matches the bound for the classical MAB problem up to a constant factor, and it significantly improves the regret bound in a recent paper on combinatorial bandits with linear rewards.
Book Chapter

Dual supervised learning

TL;DR: Dual supervised learning, as discussed by the authors, trains the models of two dual tasks simultaneously and explicitly exploits the probabilistic correlation between them to regularize training, which can improve the practical performance of both tasks.
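The duality regularizer can be written down concretely: for dual tasks x→y and y→x, penalize violations of log P(x) + log P(y|x) = log P(y) + log P(x|y). A minimal PyTorch sketch follows, where the weight `lam` is a hypothetical hyperparameter and the marginal log-probabilities are assumed to come from pretrained marginal models.

```python
import torch

def dsl_loss(loss_xy, loss_yx, logp_x, logp_y_given_x,
             logp_y, logp_x_given_y, lam=0.01):
    """Dual supervised learning sketch: the two task losses plus a
    penalty on violations of the probabilistic duality
    log P(x) + log P(y|x) = log P(y) + log P(x|y).
    `lam` is an assumed hyperparameter; in practice log P(x) and
    log P(y) are estimated by pretrained marginal (e.g. language)
    models, not learned jointly here."""
    duality_gap = (logp_x + logp_y_given_x) - (logp_y + logp_x_given_y)
    return loss_xy + loss_yx + lam * duality_gap.pow(2).mean()
```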
Posted Content

Combinatorial Multi-Armed Bandit and Its Extension to Probabilistically Triggered Arms

TL;DR: The regret analysis is tight in that it matches the bound of the UCB1 algorithm for the classical MAB problem (up to a constant factor), and it significantly improves the regret bound in an earlier paper on combinatorial bandits with linear rewards.