Alexandre Galashov

Researcher at Google

Publications: 24
Citations: 601

Alexandre Galashov is an academic researcher at Google. His research focuses on computer science and reinforcement learning. He has an h-index of 8 and has co-authored 20 publications receiving 360 citations. His previous affiliations include Novosibirsk State University and École Polytechnique.

Papers
Posted Content

Meta reinforcement learning as task inference

TL;DR: This work proposes a method that learns the policy and the task belief separately by taking advantage of various kinds of privileged information. The approach is very effective at solving standard meta-RL environments, as well as a complex continuous-control environment with sparse rewards that requires long-term memory.
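
A minimal sketch of that separation, assuming a PyTorch setup with illustrative names (not the paper's exact architecture): a recurrent belief network is trained to predict the privileged task label from the trajectory, while the policy conditions only on the inferred belief, never on the label itself.

```python
import torch
import torch.nn as nn

class BeliefNetwork(nn.Module):
    """Infers a per-step task belief from the trajectory; the task head
    is supervised with privileged task identities during training."""
    def __init__(self, obs_dim, belief_dim, num_tasks):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, belief_dim, batch_first=True)
        self.task_head = nn.Linear(belief_dim, num_tasks)

    def forward(self, trajectory):               # (batch, time, obs_dim)
        beliefs, _ = self.rnn(trajectory)
        return beliefs, self.task_head(beliefs)

class Policy(nn.Module):
    """Acts on the current observation plus the inferred belief, so task
    inference and control are learned separately."""
    def __init__(self, obs_dim, belief_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + belief_dim, 128), nn.Tanh(),
            nn.Linear(128, act_dim),
        )

    def forward(self, obs, belief):
        return self.net(torch.cat([obs, belief], dim=-1))
```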
Posted Content

Task Agnostic Continual Learning via Meta Learning

TL;DR: This work proposes a framework for the scenario where no information about task boundaries or task identity is given. It separates concerns into what task is being solved and how the task should be solved, which opens the door to combining meta-learning and continual-learning techniques and leveraging their complementary advantages.
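
One hedged way to picture the "what vs. how" split without task boundaries (all names are illustrative, and the first-order Reptile-style meta-update below is a stand-in, not the paper's exact method): fast weights are continually re-fit to a short window of recent data, while slow meta-weights learn how that adaptation should happen.

```python
import copy
import torch

def adapt(model, window, loss_fn, inner_lr=0.01, steps=5):
    """Inner loop ("what"): fit fast weights to the latest data window,
    with no task identity or boundary signal required."""
    fast = copy.deepcopy(model)
    opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    x, y = window
    for _ in range(steps):
        opt.zero_grad()
        loss_fn(fast(x), y).backward()
        opt.step()
    return fast

def meta_update(model, fast, outer_lr=0.1):
    """Outer loop ("how"): a first-order, Reptile-style step that moves
    the slow weights toward the adapted fast weights, standing in for a
    full second-order meta-update."""
    with torch.no_grad():
        for slow, new in zip(model.parameters(), fast.parameters()):
            slow.add_(outer_lr * (new - slow))
```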
Posted Content

Neural probabilistic motor primitives for humanoid control

TL;DR: A motor architecture with the general structure of an inverse model with a latent-variable bottleneck is proposed. The model can be trained entirely offline to compress thousands of expert policies and learn a motor-primitive embedding space.
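
A sketch of that latent-bottleneck structure under assumed names and dimensions: an encoder compresses a snippet of expert state-action pairs into a latent motor intent z, and a decoder policy reconstructs actions from (state, z). Training would be offline behavioral cloning against the expert actions.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Summarizes an expert snippet into a Gaussian over the latent z."""
    def __init__(self, state_dim, act_dim, latent_dim):
        super().__init__()
        self.rnn = nn.GRU(state_dim + act_dim, 128, batch_first=True)
        self.mu = nn.Linear(128, latent_dim)
        self.log_std = nn.Linear(128, latent_dim)

    def forward(self, states, actions):           # (batch, time, dim)
        h, _ = self.rnn(torch.cat([states, actions], dim=-1))
        h = h[:, -1]                               # final summary state
        return self.mu(h), self.log_std(h).exp()

class DecoderPolicy(nn.Module):
    """Reconstructs the expert action from the state and the latent z,
    giving the inverse-model shape with a latent bottleneck."""
    def __init__(self, state_dim, latent_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + latent_dim, 256), nn.ELU(),
            nn.Linear(256, act_dim),
        )

    def forward(self, state, z):
        return self.net(torch.cat([state, z], dim=-1))
```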
Proceedings Article

Information asymmetry in KL-regularized RL

TL;DR: This work starts from the KL-regularized expected-reward objective and introduces an additional component, a default policy. Crucially, it restricts the amount of information the default policy receives, forcing it to learn reusable behaviors that help the main policy learn faster.
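
A hedged sketch of the information-asymmetric KL term (the restricted slice and all names are assumptions, not the paper's interface): the agent policy sees the full observation while the default policy receives only a restricted view, so staying close to it encourages reusable, task-agnostic behavior.

```python
import torch
from torch.distributions import Normal, kl_divergence

def kl_regularized_loss(policy, default_policy, obs, returns,
                        alpha=0.1, restricted=slice(0, 16)):
    """L = -E[R] + alpha * KL(pi(.|full obs) || pi0(.|restricted obs)).

    `policy` and `default_policy` are callables returning the (mean, std)
    of a Gaussian action distribution; `restricted` is a hypothetical
    slice standing in for whatever information pi0 is allowed to see.
    """
    mu, std = policy(obs)                             # full observation
    mu0, std0 = default_policy(obs[..., restricted])  # restricted view
    kl = kl_divergence(Normal(mu, std), Normal(mu0, std0)).sum(-1)
    return (-returns + alpha * kl).mean()
```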
Proceedings Article

Neural Probabilistic Motor Primitives for Humanoid Control

TL;DR: In this paper, the authors propose a motor architecture that has the general structure of an inverse model with a latent-variable bottleneck, and show that it is possible to train this model entirely offline to compress thousands of expert policies.