Michal Valko
Researcher at École Normale Supérieure
Publications - 171
Citations - 6266
Michal Valko is an academic researcher from École Normale Supérieure. The author has contributed to research on topics including regret and reinforcement learning. The author has an h-index of 26 and has co-authored 169 publications receiving 3088 citations. Previous affiliations of Michal Valko include the University of Pittsburgh and the French Institute for Research in Computer Science and Automation.
Papers
Posted Content
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko
TL;DR: This work introduces Bootstrap Your Own Latent (BYOL), a new approach to self-supervised image representation learning that performs on par with or better than the current state of the art on both transfer and semi-supervised benchmarks.
Proceedings Article
Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre H. Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Daniel Guo, Mohammad Gheshlaghi Azar, Bilal Piot, Koray Kavukcuoglu, Rémi Munos, Michal Valko
TL;DR: In this article, the authors investigate and provide new insights on the sampling rule called Top-Two Thompson Sampling (TTTS), and justify its use for fixed-confidence best-arm identification.
Posted Content
Finite-Time Analysis of Kernelised Contextual Bandits
TL;DR: This work proposes KernelUCB, a kernelised UCB algorithm, and gives a cumulative regret bound through a frequentist analysis. For the agnostic case, it improves the regret bound of GP-UCB in terms of both the kernel-dependent quantity and the RKHS norm of the reward function.
Journal ArticleDOI
Outlier detection for patient monitoring and alerting
Milos Hauskrecht, Iyad Batal, Michal Valko, Shyam Visweswaran, Gregory F. Cooper, Gilles Clermont
TL;DR: The hypothesis is that a patient-management decision that is unusual with respect to past patient care may be due to an error, and that it is worthwhile to generate an alert when such a decision is encountered. The evaluation suggests that outlier-based alerting can lead to promising true alert rates.
Proceedings Article
Efficient learning by implicit exploration in bandit problems with side observations
TL;DR: This work proposes the first algorithm that enjoys near-optimal regret guarantees without having to know the observation system before selecting its actions and defines a new partial information setting that models online combinatorial optimization problems where the feedback received by the learner is between semi-bandit and full feedback.