Michal Valko

Researcher at École Normale Supérieure

Publications - 171

Citations - 6266

Michal Valko is an academic researcher at École Normale Supérieure. The author has contributed to research on topics including regret and reinforcement learning. The author has an h-index of 26 and has co-authored 169 publications receiving 3088 citations. Previous affiliations of Michal Valko include the University of Pittsburgh and the French Institute for Research in Computer Science and Automation.

Papers

Posted Content

Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning

Jean-Bastien Grill, +13 more

- 13 Jun 2020 -

arXiv: Learning

TL;DR: This work introduces Bootstrap Your Own Latent (BYOL), a new approach to self-supervised image representation learning that performs on par with or better than the current state of the art on both transfer and semi-supervised benchmarks.

Proceedings Article

Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning

Jean-Bastien Grill, +13 more

TL;DR: In this article, the authors investigate and provide new insights on the sampling rule called Top-Two Thompson Sampling (TTTS), and justify its use for fixed-confidence best-arm identification.

Posted Content

Finite-Time Analysis of Kernelised Contextual Bandits

Michal Valko, +4 more

- 26 Sep 2013 -

arXiv: Learning

TL;DR: This work proposes KernelUCB, a kernelised UCB algorithm, and gives a cumulative regret bound through a frequentist analysis. The bound improves on the regret bound of GP-UCB for the agnostic case, both in terms of the kernel-dependent quantity and the RKHS norm of the reward function.

Journal Article

Outlier detection for patient monitoring and alerting

Milos Hauskrecht, +5 more

- 01 Feb 2013 -

Journal of Biomedical Informatics

TL;DR: The hypothesis is that a patient-management decision that is unusual with respect to past patient care may be due to an error, and that it is worthwhile to generate an alert when such a decision is encountered. The work reports that outlier-based alerting can lead to promising true alert rates.

Proceedings Article

Efficient learning by implicit exploration in bandit problems with side observations

Tomáš Kocák, +3 more

TL;DR: This work proposes the first algorithm that enjoys near-optimal regret guarantees without having to know the observation system before selecting its actions. It also defines a new partial-information setting that models online combinatorial optimization problems where the feedback received by the learner lies between semi-bandit and full feedback.
