Alexandre Galashov
Researcher at Google
Publications - 24
Citations - 601
Alexandre Galashov is an academic researcher at Google. He has contributed to research in computer science and reinforcement learning, has an h-index of 8, and has co-authored 20 publications receiving 360 citations. Previous affiliations of Alexandre Galashov include Novosibirsk State University and École Polytechnique.
Papers
Posted Content
Meta reinforcement learning as task inference
Jan Humplik,Alexandre Galashov,Leonard Hasenclever,Pedro A. Ortega,Yee Whye Teh,Nicolas Heess +5 more
TL;DR: This work proposes a method that separately learns the policy and the task belief by taking advantage of various kinds of privileged information, which can be very effective at solving standard meta-RL environments, as well as a complex continuous control environment with sparse rewards and requiring long-term memory.
Posted Content
Task Agnostic Continual Learning via Meta Learning
TL;DR: This work proposes a framework specifically for the scenario where no information about task boundaries or task identity is given, and introduces a separation of concerns between what task is being solved and how the task should be solved. This opens the door to combining meta-learning and continual-learning techniques, leveraging their complementary advantages.
Posted Content
Neural probabilistic motor primitives for humanoid control
Josh Merel,Leonard Hasenclever,Alexandre Galashov,Arun Ahuja,Vu Pham,Greg Wayne,Yee Whye Teh,Nicolas Heess +7 more
TL;DR: A motor architecture with the general structure of an inverse model with a latent-variable bottleneck is proposed, and it is shown that this model can be trained entirely offline to compress thousands of expert policies and learn a motor primitive embedding space.
Proceedings Article
Information asymmetry in KL-regularized RL
Alexandre Galashov,Siddhant M. Jayakumar,Leonard Hasenclever,Dhruva Tirumala,Jonathan Schwarz,Guillaume Desjardins,Wojciech Marian Czarnecki,Yee Whye Teh,Razvan Pascanu,Nicolas Heess +9 more
TL;DR: This work starts from the KL-regularized expected-reward objective and introduces an additional component, a default policy, while crucially restricting the amount of information the default policy receives, forcing it to learn reusable behaviors that help the policy learn faster.
Proceedings Article
Neural Probabilistic Motor Primitives for Humanoid Control
Josh Merel,Leonard Hasenclever,Alexandre Galashov,Arun Ahuja,Vu Pham,Greg Wayne,Yee Whye Teh,Nicolas Heess +7 more
TL;DR: In this paper, the authors propose a motor architecture that has the general structure of an inverse model with a latent-variable bottleneck, and show that it is possible to train this model entirely offline to compress thousands of expert policies.