Olivier Pietquin
Researcher at Google
Publications - 248
Citations - 7478
Olivier Pietquin is an academic researcher at Google. The author has contributed to research on reinforcement learning and Markov decision processes. The author has an h-index of 35 and has co-authored 228 publications receiving 6279 citations. Previous affiliations of Olivier Pietquin include the University of Grenoble and the University of Lille.
Papers
Posted Content
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
Matej Vecerík,Todd Hester,Jonathan Scholz,Fumin Wang,Olivier Pietquin,Bilal Piot,Nicolas Heess,Thomas Rothörl,Thomas Lampe,Martin Riedmiller +9 more
TL;DR: A general, model-free approach to reinforcement learning on real robots with sparse rewards. It builds on the Deep Deterministic Policy Gradient (DDPG) algorithm to make use of demonstrations, outperforms DDPG, and does not require engineered rewards.
Posted Content
Deep Q-learning from Demonstrations
Todd Hester,Matej Vecerík,Olivier Pietquin,Marc Lanctot,Tom Schaul,Bilal Piot,Dan Horgan,John Quan,Andrew Sendonaris,Gabriel Dulac-Arnold,Ian Osband,John P. Agapiou,Joel Z. Leibo,Audrunas Gruslys +13 more
TL;DR: Deep Q-learning from Demonstrations (DQfD) leverages small sets of demonstration data to massively accelerate learning, and automatically assesses the necessary ratio of demonstration data while learning thanks to a prioritized replay mechanism.
Proceedings Article
Noisy Networks For Exploration
Meire Fortunato,Mohammad Gheshlaghi Azar,Bilal Piot,Jacob Menick,Ian Osband,Alex Graves,Vlad Mnih,Rémi Munos,Demis Hassabis,Olivier Pietquin,Charles Blundell,Shane Legg +11 more
TL;DR: Replacing the conventional exploration heuristics for A3C, DQN, and dueling agents with NoisyNet yields substantially higher scores across a wide range of Atari games, in some cases advancing the agent from sub-human to super-human performance.
Proceedings Article
GuessWhat?! Visual Object Discovery through Multi-modal Dialogue
TL;DR: This work introduces GuessWhat?!
Proceedings Article
Modulating early visual processing by language
TL;DR: Conditional batch normalization (CBN) is used to modulate convolutional feature maps by a linguistic embedding, leading to the MODulatEd ResNet (MODERN) architecture.
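The conditional batch normalization idea in the last entry can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the flat [batch][channel][position] layout, and the use of a single linear map (rather than a learned network) to predict the per-channel shifts from the language embedding are all hypothetical choices made for brevity.

```python
import math


def linear(weights, bias, vec):
    """Affine map: one output per row of `weights` (hypothetical helper)."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def conditional_batch_norm(x, emb, gamma, beta,
                           w_gamma, b_gamma, w_beta, b_beta, eps=1e-5):
    """Batch norm whose scale and shift are adjusted per example.

    x:    feature maps as [batch][channel][position]
    emb:  one language embedding per example, [batch][emb_dim]
    gamma, beta: ordinary per-channel batch-norm parameters
    w_*, b_*:    assumed linear predictors of the per-channel deltas
    """
    n_batch, n_chan = len(x), len(gamma)
    # Each example gets its own delta for every channel.
    d_gamma = [linear(w_gamma, b_gamma, e) for e in emb]
    d_beta = [linear(w_beta, b_beta, e) for e in emb]
    out = [[None] * n_chan for _ in range(n_batch)]
    for c in range(n_chan):
        # Statistics are shared across the batch and all positions.
        vals = [v for b in range(n_batch) for v in x[b][c]]
        mean = sum(vals) / len(vals)
        var = sum((v - mean) ** 2 for v in vals) / len(vals)
        inv_std = 1.0 / math.sqrt(var + eps)
        for b in range(n_batch):
            scale = gamma[c] + d_gamma[b][c]
            shift = beta[c] + d_beta[b][c]
            out[b][c] = [scale * (v - mean) * inv_std + shift
                         for v in x[b][c]]
    return out
```

With all delta weights and biases set to zero the function reduces to ordinary batch normalization, so the language input only perturbs an existing normalization layer rather than replacing it.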