
Showing papers by "Rob Fergus published in 2020"


Posted Content
TL;DR: The addition of the augmentation method dramatically improves SAC's performance, enabling it to reach state-of-the-art performance on the DeepMind control suite, surpassing model-based methods and the recently proposed contrastive learning method CURL.
Abstract: We propose a simple data augmentation technique that can be applied to standard model-free reinforcement learning algorithms, enabling robust learning directly from pixels without the need for auxiliary losses or pre-training. The approach leverages input perturbations commonly used in computer vision tasks to regularize the value function. Existing model-free approaches, such as Soft Actor-Critic (SAC), are not able to train deep networks effectively from image pixels. However, the addition of our augmentation method dramatically improves SAC's performance, enabling it to reach state-of-the-art performance on the DeepMind control suite, surpassing model-based (Dreamer, PlaNet, and SLAC) methods and recently proposed contrastive learning (CURL). Our approach can be combined with any model-free reinforcement learning algorithm, requiring only minor modifications. An implementation can be found at this https URL.

395 citations
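The core idea in the abstract above, perturbing pixel observations with standard vision augmentations to regularize the value function, can be sketched as follows. This is a minimal illustration assuming a pad-and-random-shift augmentation; the function names (`random_shift`, `averaged_q_target`) are hypothetical, not the authors' code:

```python
import numpy as np

def random_shift(obs, pad=4, rng=None):
    """Pad an H x W x C image by edge replication, then crop a random
    window of the original size. A sketch of the shift-style input
    perturbation the abstract describes."""
    if rng is None:
        rng = np.random.default_rng()
    h, w, _ = obs.shape
    padded = np.pad(obs, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    top = rng.integers(0, 2 * pad + 1)
    left = rng.integers(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w]

def averaged_q_target(q_fn, next_obs, k=2):
    """Regularize the value function by averaging a Q estimate over k
    independently augmented views of the same observation."""
    return np.mean([q_fn(random_shift(next_obs)) for _ in range(k)], axis=0)
```

Because the augmentation is applied to inputs of an otherwise standard algorithm, this requires only minor changes to an existing SAC-style critic update, matching the abstract's claim.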


Posted Content
TL;DR: This paper compares three approaches for automatically finding an appropriate augmentation and shows that the resulting agent outperforms other baselines specifically designed to improve generalization in RL, learning policies and representations that are more robust to environment changes that do not affect the agent.
Abstract: Deep reinforcement learning (RL) agents often fail to generalize to unseen scenarios, even when they are trained on many instances of semantically similar environments. Data augmentation has recently been shown to improve the sample efficiency and generalization of RL agents. However, different tasks tend to benefit from different kinds of data augmentation. In this paper, we compare three approaches for automatically finding an appropriate augmentation. These are combined with two novel regularization terms for the policy and value function, required to make the use of data augmentation theoretically sound for certain actor-critic algorithms. We evaluate our methods on the Procgen benchmark, which consists of 16 procedurally-generated environments, and show that it improves test performance by ~40% relative to standard RL algorithms. Our agent outperforms other baselines specifically designed to improve generalization in RL. In addition, we show that our agent learns policies and representations that are more robust to changes in the environment that do not affect the agent, such as the background. Our implementation is available at this https URL.

76 citations
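The two regularization terms the abstract mentions can be sketched for a discrete-action actor-critic: the policy on an augmented observation is pushed toward the policy on the original observation, and likewise for the value. A hedged illustration with hypothetical names, not the paper's implementation:

```python
import numpy as np

def augmentation_regularizers(obs, augment, policy_fn, value_fn):
    """Sketch of policy/value regularization under data augmentation:
    penalize disagreement between the agent's outputs on an observation
    and on its augmented counterpart. Illustrative, not the paper's code."""
    aug_obs = augment(obs)
    pi, pi_aug = policy_fn(obs), policy_fn(aug_obs)
    # KL(pi(.|s) || pi(.|aug(s))) over a discrete action distribution
    g_pi = float(np.sum(pi * (np.log(pi) - np.log(pi_aug))))
    # squared difference between the values of the two views
    g_v = (value_fn(obs) - value_fn(aug_obs)) ** 2
    return g_pi, g_v
```

Both terms are zero when the agent's policy and value are already invariant to the augmentation, so adding them to the actor-critic loss only penalizes augmentation-sensitive behavior.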


Posted Content
TL;DR: An investigation of the model's outputs and hidden representations finds that it captures physicochemical properties relevant to protein energy.
Abstract: We propose an energy-based model (EBM) of protein conformations that operates at atomic scale. The model is trained solely on crystallized protein data. By contrast, existing approaches for scoring conformations use energy functions that incorporate knowledge of physical principles and features that are the complex product of several decades of research and tuning. To evaluate the model, we benchmark on the rotamer recovery task, the problem of predicting the conformation of a side chain from its context within a protein structure, which has been used to evaluate energy functions for protein design. The model achieves performance close to that of the Rosetta energy function, a state-of-the-art method widely used in protein structure prediction and design. An investigation of the model's outputs and hidden representations finds that it captures physicochemical properties relevant to protein energy.

30 citations


Proceedings Article
30 Apr 2020
TL;DR: In this paper, an energy-based model (EBM) of protein conformations that operates at atomic scale is proposed; the model is trained solely on crystallized protein data.
Abstract: We propose an energy-based model (EBM) of protein conformations that operates at atomic scale. The model is trained solely on crystallized protein data. By contrast, existing approaches for scoring conformations use energy functions that incorporate knowledge of physical principles and features that are the complex product of several decades of research and tuning. To evaluate our model, we benchmark on the rotamer recovery task, a restricted problem setting used to evaluate energy functions for protein design. Our model achieves comparable performance to the Rosetta energy function, a state-of-the-art method widely used in protein structure prediction and design. An investigation of the model's outputs and hidden representations finds that it captures physicochemical properties relevant to protein energy.

17 citations
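The rotamer recovery benchmark both EBM entries describe can be sketched generically: score every candidate side-chain conformation with an energy function (learned or physics-based) and predict the lowest-energy one, reporting the fraction that matches the native rotamer. A hypothetical illustration, not the authors' pipeline:

```python
import numpy as np

def rotamer_recovery(energy_fn, contexts, candidates, native_idx):
    """For each side-chain environment, score every candidate conformation
    with the given energy function, predict the lowest-energy candidate,
    and report the fraction matching the native rotamer. Illustrative names."""
    correct = 0
    for ctx, cands, native in zip(contexts, candidates, native_idx):
        energies = [energy_fn(ctx, c) for c in cands]
        if int(np.argmin(energies)) == native:
            correct += 1
    return correct / len(contexts)
```

Because the metric only compares energies of alternatives at a fixed site, it evaluates learned and hand-crafted energy functions on equal footing, which is why the abstracts can compare directly against Rosetta.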


Posted Content
TL;DR: Policy-Dynamics Value Functions (PD-VF), a novel approach for rapidly adapting to dynamics different from those previously seen in training, is introduced and shown to adapt quickly to new dynamics on a set of MuJoCo domains.
Abstract: Standard RL algorithms assume fixed environment dynamics and require a significant amount of interaction to adapt to new environments. We introduce Policy-Dynamics Value Functions (PD-VF), a novel approach for rapidly adapting to dynamics different from those previously seen in training. PD-VF explicitly estimates the cumulative reward in a space of policies and environments. An ensemble of conventional RL policies is used to gather experience on training environments, from which embeddings of both policies and environments can be learned. Then, a value function conditioned on both embeddings is trained. At test time, a few actions are sufficient to infer the environment embedding, enabling a policy to be selected by maximizing the learned value function (which requires no additional environment interaction). We show that our method can rapidly adapt to new dynamics on a set of MuJoCo domains. Code available at this https URL.

10 citations
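The test-time procedure in the abstract, selecting a policy by maximizing the learned value function over policy embeddings once the environment embedding has been inferred from a few actions, can be sketched as follows. The names are illustrative; the real method learns both embedding spaces and the value network during training:

```python
import numpy as np

def select_policy(value_fn, policy_embs, env_emb):
    """Given an environment embedding inferred from a few initial actions,
    return the index of the policy embedding that maximizes the learned
    value function. No further environment interaction is required."""
    scores = [value_fn(z_pi, env_emb) for z_pi in policy_embs]
    best = int(np.argmax(scores))
    return best, scores[best]
```

The key design choice is that adaptation reduces to an argmax in embedding space, which is why only a handful of actions (to infer the environment embedding) are needed at test time.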