Alexandre Galashov
Researcher at Google
Publications - 24
Citations - 601
Alexandre Galashov is an academic researcher at Google. He has contributed to research in computer science and reinforcement learning, has an h-index of 8, and has co-authored 20 publications receiving 360 citations. Previous affiliations of Alexandre Galashov include Novosibirsk State University and École Polytechnique.
Papers
Posted Content
Meta reinforcement learning as task inference
Jan Humplik,Alexandre Galashov,Leonard Hasenclever,Pedro A. Ortega,Yee Whye Teh,Nicolas Heess +5 more
TL;DR: This work proposes a method that separately learns the policy and the task belief by taking advantage of various kinds of privileged information, which can be very effective at solving standard meta-RL environments, as well as a complex continuous control environment with sparse rewards and requiring long-term memory.
Posted Content
Task Agnostic Continual Learning via Meta Learning
TL;DR: This work proposes a framework specifically for the scenario where no information about task boundaries or task identity is given, and introduces a separation of concerns between what task is being solved and how the task should be solved. This opens the door to combining meta-learning and continual-learning techniques, leveraging their complementary advantages.
Posted Content
Neural probabilistic motor primitives for humanoid control
Josh Merel,Leonard Hasenclever,Alexandre Galashov,Arun Ahuja,Vu Pham,Greg Wayne,Yee Whye Teh,Nicolas Heess +7 more
TL;DR: A motor architecture with the general structure of an inverse model with a latent-variable bottleneck is proposed, and it is shown that this model can be trained entirely offline to compress thousands of expert policies and learn a motor primitive embedding space.
Proceedings Article
Information asymmetry in KL-regularized RL
Alexandre Galashov,Siddhant M. Jayakumar,Leonard Hasenclever,Dhruva Tirumala,Jonathan Schwarz,Guillaume Desjardins,Wojciech Marian Czarnecki,Yee Whye Teh,Razvan Pascanu,Nicolas Heess +9 more
TL;DR: This work starts from the KL-regularized expected-reward objective and introduces an additional component, a default policy, while crucially restricting the amount of information the default policy receives, forcing it to learn reusable behaviors that help the policy learn faster.
Proceedings Article
Neural Probabilistic Motor Primitives for Humanoid Control
Josh Merel,Leonard Hasenclever,Alexandre Galashov,Arun Ahuja,Vu Pham,Greg Wayne,Yee Whye Teh,Nicolas Heess +7 more
TL;DR: In this paper, the authors propose a motor architecture that has the general structure of an inverse model with a latent-variable bottleneck, and show that it is possible to train this model entirely offline to compress thousands of expert policies.