Sylvain Gelly
Researcher at Google
Publications - 127
Citations - 24381
Sylvain Gelly is an academic researcher at Google. His work focuses on topics including feature learning and unsupervised learning. He has an h-index of 42 and has co-authored 126 publications receiving 9,393 citations. His previous affiliations include the University of Paris-Sud and the French Institute for Research in Computer Science and Automation (Inria).
Papers
Posted Content
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby +11 more
TL;DR: Vision Transformer (ViT) attains excellent results compared to state-of-the-art convolutional networks while requiring substantially fewer computational resources to train.
Proceedings Article
Parameter-Efficient Transfer Learning for NLP
Neil Houlsby, Andrei Giurgiu, Stanisław Jastrzębski, Bruna Halila Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly +7 more
TL;DR: To demonstrate the adapters' effectiveness, the recently proposed BERT Transformer model is transferred to 26 diverse text classification tasks, including the GLUE benchmark; adapters attain near state-of-the-art performance while adding only a few parameters per task.
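The adapter idea summarized above is a small bottleneck module inserted into a pretrained network, with a residual connection so it starts out as a near-identity function; only the adapter weights are trained per task. A minimal NumPy sketch (dimensions and initialization scale are illustrative, not the paper's exact configuration):

```python
import numpy as np

def adapter(x, W_down, W_up):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual add."""
    h = np.maximum(0.0, x @ W_down)  # down-projection + ReLU (illustrative nonlinearity)
    return x + h @ W_up              # up-projection with skip connection

d_model, d_bottleneck = 8, 2  # toy sizes; real models use much larger d_model
rng = np.random.default_rng(0)
W_down = rng.normal(scale=0.01, size=(d_model, d_bottleneck))
W_up = np.zeros((d_bottleneck, d_model))  # zero init -> adapter starts as identity
x = rng.normal(size=(4, d_model))
y = adapter(x, W_down, W_up)  # equals x at initialization
```

Per task, only `W_down` and `W_up` (2 × d_model × d_bottleneck values) are trained, which is why the parameter cost per task stays small relative to fine-tuning the full model.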
Proceedings Article
Challenging Common Assumptions in the Unsupervised Learning of Disentangled Representations
Francesco Locatello, Stefan Bauer, Mario Lucic, Gunnar Rätsch, Sylvain Gelly, Bernhard Schölkopf, Olivier Bachem +6 more
TL;DR: The authors show that the unsupervised learning of disentangled representations is fundamentally impossible without inductive biases on both the models and the data, and suggest that future work on disentanglement learning should be explicit about the role of inductive bias and (implicit) supervision.
Proceedings Article
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby +11 more
TL;DR: The Vision Transformer (ViT) applies a pure transformer directly to sequences of image patches and performs very well on image classification, achieving state-of-the-art results on ImageNet, CIFAR-100, VTAB, and other benchmarks.
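The "sequences of image patches" step can be sketched concretely: the image is split into non-overlapping 16×16 patches, each flattened into a vector, giving the token sequence the transformer consumes (in the paper, a learned linear projection then maps each patch to the model dimension, and position embeddings plus a classification token are added). A minimal sketch of the patchification itself:

```python
import numpy as np

def patchify(image, patch=16):
    """Split an H x W x C image into non-overlapping flattened patch tokens."""
    H, W, C = image.shape
    tokens = []
    for i in range(0, H, patch):
        for j in range(0, W, patch):
            tokens.append(image[i:i + patch, j:j + patch].reshape(-1))
    return np.stack(tokens)  # shape: (num_patches, patch * patch * C)

img = np.zeros((224, 224, 3))       # standard 224x224 RGB input
tokens = patchify(img)              # (224/16)^2 = 196 patches, each 16*16*3 = 768-dim
```

Hence the title: a 224×224 image becomes a sequence of 196 "words", each a 768-dimensional flattened patch, and a standard transformer encoder runs on that sequence.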
Posted Content
Big Transfer (BiT): General Visual Representation Learning
Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby +6 more
TL;DR: By combining a few carefully selected components, and transferring using a simple heuristic, Big Transfer achieves strong performance on over 20 datasets and performs well across a surprisingly wide range of data regimes -- from 1 example per class to 1M total examples.
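The "simple heuristic" mentioned above sets the fine-tuning recipe from the target dataset's size rather than per-task hyperparameter search. A hypothetical sketch of such a size-based rule (the thresholds and values below are illustrative assumptions, not the paper's exact schedule):

```python
def transfer_schedule(num_examples):
    """Hypothetical size-based fine-tuning rule: larger target datasets get
    longer schedules and heavier regularization. Values are illustrative only."""
    if num_examples < 20_000:        # small regime, e.g. few examples per class
        return {"steps": 500, "mixup": False}
    if num_examples < 500_000:       # medium regime
        return {"steps": 10_000, "mixup": True}
    return {"steps": 20_000, "mixup": True}  # large regime, e.g. ~1M examples

small = transfer_schedule(1_000)      # short schedule, no extra regularization
large = transfer_schedule(1_000_000)  # long schedule with mixup
```

The appeal of a rule like this is exactly what the TL;DR claims: one pretrained model plus one cheap, search-free recipe covers data regimes from a handful of examples per class up to millions of examples.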