Top 4 papers published by Sergio Guadarrama from Google in 2019

Journal Article

The Devil is in the Decoder: Classification, Regression and GANs

Zbigniew Wojna¹, Vittorio Ferrari², Sergio Guadarrama², Nathan Silberman², Liang-Chieh Chen², Alireza Fathi², Jasper Uijlings²

University College London¹, Google²

01 Dec 2019-International Journal of Computer Vision

TL;DR: This paper presents an extensive comparison of decoders across pixel-wise tasks ranging from classification and regression to synthesis, and introduces new residual-like connections for decoders.

Abstract: Many machine vision applications, such as semantic segmentation and depth prediction, require predictions for every pixel of the input image. Models for such problems usually consist of encoders, which decrease spatial resolution while learning a high-dimensional representation, followed by decoders, which recover the original input resolution and produce low-dimensional predictions. While encoders have been studied rigorously, relatively few studies address the decoder side. This paper presents an extensive comparison of a variety of decoders for a variety of pixel-wise tasks ranging from classification and regression to synthesis. Our contributions are: (1) Decoders matter: we observe significant variance in results between different types of decoders on various problems. (2) We introduce new residual-like connections for decoders. (3) We introduce a novel decoder: bilinear additive upsampling. (4) We explore prediction artifacts.

70 citations

Posted Content

From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following.

Justin Fu¹, Anoop Korattikara¹, Sergey Levine¹, Sergio Guadarrama¹

Google¹

20 Feb 2019-arXiv: Learning

TL;DR: This work proposes language-conditioned reward learning (LC-RL), which grounds language commands as a reward function represented by a deep neural network, and demonstrates that the model learns rewards that transfer to novel tasks and environments on realistic, high-dimensional visual environments with natural language commands.

Abstract: Reinforcement learning is a promising framework for solving control problems, but its use in practical situations is hampered by the fact that reward functions are often difficult to engineer. Specifying goals and tasks for autonomous machines, such as robots, is a significant challenge: conventionally, reward functions and goal states have been used to communicate objectives. But people can communicate objectives to each other simply by describing or demonstrating them. How can we build learning algorithms that will allow us to tell machines what we want them to do? In this work, we investigate the problem of grounding language commands as reward functions using inverse reinforcement learning, and argue that language-conditioned rewards are more transferable than language-conditioned policies to new environments. We propose language-conditioned reward learning (LC-RL), which grounds language commands as a reward function represented by a deep neural network. We demonstrate that our model learns rewards that transfer to novel tasks and environments on realistic, high-dimensional visual environments with natural language commands, whereas directly learning a language-conditioned policy leads to poor performance.

68 citations

Proceedings Article

From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following

Justin Fu¹, Anoop Korattikara¹, Sergey Levine², Sergio Guadarrama¹

Google¹, University of California, Berkeley²

01 Jan 2019

TL;DR: This article proposes language-conditioned reward learning (LC-RL), which grounds language commands as a reward function represented by a deep neural network, and demonstrates that the model learns rewards that transfer to novel tasks and environments on realistic, high-dimensional visual environments with natural language commands.

Abstract: Reinforcement learning is a promising framework for solving control problems, but its use in practical situations is hampered by the fact that reward functions are often difficult to engineer. Specifying goals and tasks for autonomous machines, such as robots, is a significant challenge: conventionally, reward functions and goal states have been used to communicate objectives. But people can communicate objectives to each other simply by describing or demonstrating them. How can we build learning algorithms that will allow us to tell machines what we want them to do? In this work, we investigate the problem of grounding language commands as reward functions using inverse reinforcement learning, and argue that language-conditioned rewards are more transferable than language-conditioned policies to new environments. We propose language-conditioned reward learning (LC-RL), which grounds language commands as a reward function represented by a deep neural network. We demonstrate that our model learns rewards that transfer to novel tasks and environments on realistic, high-dimensional visual environments with natural language commands, whereas directly learning a language-conditioned policy leads to poor performance.

59 citations

Posted Content

Measuring the Reliability of Reinforcement Learning Algorithms

Stephanie C.Y. Chan¹, Samuel Fishman¹, John Canny¹, Anoop Korattikara¹, Sergio Guadarrama¹

Google¹

10 Dec 2019-arXiv: Machine Learning

TL;DR: This work proposes a set of general-purpose metrics that quantitatively measure different aspects of reliability, together with complementary statistical tests that enable rigorous comparisons on these metrics.

Abstract: Lack of reliability is a well-known issue for reinforcement learning (RL) algorithms. This problem has gained increasing attention in recent years, and efforts to improve it have grown substantially. To aid RL researchers and production users with the evaluation and improvement of reliability, we propose a set of metrics that quantitatively measure different aspects of reliability. In this work, we focus on variability and risk, both during training and after learning (on a fixed policy). We designed these metrics to be general-purpose, and we also designed complementary statistical tests to enable rigorous comparisons on these metrics. In this paper, we first describe the desired properties of the metrics and their design, the aspects of reliability that they measure, and their applicability to different scenarios. We then describe the statistical tests and make additional practical recommendations for reporting results. The metrics and accompanying statistical tools have been made available as an open-source library at this https URL. We apply our metrics to a set of common RL algorithms and environments, compare them, and analyze the results.
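As a rough illustration of the two aspects named in the abstract, variability and risk, the sketch below computes one possible statistic for each from a set of training curves. The choice of interquartile range for variability and conditional value at risk (CVaR) for risk is an assumption made here for illustration, and the function names `iqr`, `dispersion_across_runs`, and `cvar` are hypothetical; none of this is taken from the open-source library the abstract mentions.

```python
from statistics import mean, quantiles


def iqr(values):
    """Interquartile range of a sequence: a robust measure of spread."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def dispersion_across_runs(curves):
    """Variability sketch: IQR across runs at each step, averaged over steps.

    `curves` is a list of equal-length training curves, one per run.
    """
    per_step = [iqr(step_values) for step_values in zip(*curves)]
    return mean(per_step)


def cvar(values, alpha=0.25):
    """Risk sketch: mean of the worst `alpha` fraction of the values."""
    ordered = sorted(values)
    k = max(1, int(len(ordered) * alpha))
    return mean(ordered[:k])


if __name__ == "__main__":
    # Made-up returns for three runs of a hypothetical algorithm.
    runs = [[1.0, 2.0, 4.0], [1.5, 2.5, 3.0], [0.5, 3.0, 5.0]]
    print("dispersion across runs:", dispersion_across_runs(runs))
    print("CVaR of final returns:", cvar([run[-1] for run in runs]))
```

Lower dispersion would indicate more consistent training across runs, and a higher CVaR would indicate better worst-case performance; comparing algorithms on either number would still need the kind of statistical testing the abstract describes.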

7 citations
