Top 9 papers published by Vincent Vanhoucke from Google in 2018

Proceedings Article

QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation

Dmitry Kalashnikov¹, Alex Irpan¹, Peter Pastor², Julian Ibarz¹, Alexander Herzog³, Eric Jang¹, Deirdre Quillen¹, Ethan Holly¹, Mrinal Kalakrishnan², Vincent Vanhoucke¹, Sergey Levine⁴

Google¹, Stanford University², Max Planck Society³, University of California, Berkeley⁴

27 Jun 2018

TL;DR: QT-Opt is a scalable self-supervised vision-based reinforcement learning framework that leverages over 580k real-world grasp attempts to train a deep neural network Q-function with over 1.2M parameters.

Abstract: In this paper, we study the problem of learning vision-based dynamic manipulation skills using a scalable reinforcement learning approach. We study this problem in the context of grasping, a longstanding challenge in robotic manipulation. In contrast to static learning behaviors that choose a grasp point and then execute the desired grasp, our method enables closed-loop vision-based control, whereby the robot continuously updates its grasp strategy based on the most recent observations to optimize long-horizon grasp success. To that end, we introduce QT-Opt, a scalable self-supervised vision-based reinforcement learning framework that can leverage over 580k real-world grasp attempts to train a deep neural network Q-function with over 1.2M parameters to perform closed-loop, real-world grasping that generalizes to 96% grasp success on unseen objects. Aside from attaining a very high success rate, our method exhibits behaviors that are quite distinct from more standard grasping systems: using only RGB vision-based perception from an over-the-shoulder camera, our method automatically learns regrasping strategies, probes objects to find the most effective grasps, learns to reposition objects and perform other non-prehensile pre-grasp manipulations, and responds dynamically to disturbances and perturbations.

884 citations

Proceedings Article

Sim-to-Real: Learning Agile Locomotion For Quadruped Robots

Jie Tan¹, Tingnan Zhang², Erwin Coumans¹, Atil Iscen¹, Yunfei Bai¹, Danijar Hafner¹, Steven Bohez³, Vincent Vanhoucke¹

Google¹, Georgia Institute of Technology², Ghent University³

26 Jun 2018

TL;DR: This system can learn quadruped locomotion from scratch using simple reward signals; users can provide an open-loop reference to guide the learning process when more control over the learned gait is needed.

Abstract: Designing agile locomotion for quadruped robots often requires extensive expertise and tedious manual tuning. In this paper, we present a system to automate this process by leveraging deep reinforcement learning techniques. Our system can learn quadruped locomotion from scratch using simple reward signals. In addition, users can provide an open loop reference to guide the learning process when more control over the learned gait is needed. The control policies are learned in a physics simulator and then deployed on real robots. In robotics, policies trained in simulation often do not transfer to the real world. We narrow this reality gap by improving the physics simulator and learning robust policies. We improve the simulation using system identification, developing an accurate actuator model and simulating latency. We learn robust controllers by randomizing the physical environments, adding perturbations and designing a compact observation space. We evaluate our system on two agile locomotion gaits: trotting and galloping. After learning in simulation, a quadruped robot can successfully perform both gaits in the real world.
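
The abstract mentions learning robust controllers by randomizing the physical environments and simulating latency. A minimal sketch of per-episode randomization follows; the parameter names and ranges are hypothetical placeholders chosen for illustration, not values taken from the paper.

```python
import random
from dataclasses import dataclass


@dataclass
class PhysicsParams:
    """Hypothetical simulator settings that are re-sampled for every episode."""
    mass_scale: float   # multiplier on the nominal body mass
    friction: float     # contact friction coefficient
    latency_s: float    # simulated control latency, in seconds


# Assumed (illustrative) ranges; a real setup would calibrate these.
RANGES = {
    "mass_scale": (0.8, 1.2),
    "friction": (0.5, 1.25),
    "latency_s": (0.0, 0.04),
}


def sample_physics(rng: random.Random) -> PhysicsParams:
    """Draw one set of physics parameters uniformly from RANGES."""
    return PhysicsParams(
        **{name: rng.uniform(low, high) for name, (low, high) in RANGES.items()}
    )


rng = random.Random(0)
# One independent draw per training episode.
episodes = [sample_physics(rng) for _ in range(5)]
```

Each training episode would then run under a different draw, so that a policy cannot overfit to a single simulated configuration.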

520 citations

Proceedings Article

Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping

Konstantinos Bousmalis¹, Alex Irpan¹, Paul Wohlhart¹, Yunfei Bai¹, Matthew Kelcey¹, Mrinal Kalakrishnan², Laura Downs¹, Julian Ibarz¹, Peter Pastor², Kurt Konolige¹, Sergey Levine¹, Vincent Vanhoucke¹

Google¹, Stanford University²

21 May 2018

TL;DR: The authors study how randomized simulated environments and domain adaptation methods can be extended to train a system to grasp novel objects from raw monocular RGB images. They evaluate their approaches on more than 25,000 physical test grasps and introduce GraspGAN, a novel extension of pixel-level domain adaptation.

Abstract: Instrumenting and collecting annotated visual grasping datasets to train modern machine learning algorithms can be extremely time-consuming and expensive. An appealing alternative is to use off-the-shelf simulators to render synthetic data for which ground-truth annotations are generated automatically. Unfortunately, models trained purely on simulated data often fail to generalize to the real world. We study how randomized simulated environments and domain adaptation methods can be extended to train a grasping system to grasp novel objects from raw monocular RGB images. We extensively evaluate our approaches with a total of more than 25,000 physical test grasps, studying a range of simulation conditions and domain adaptation methods, including a novel extension of pixel-level domain adaptation that we term the GraspGAN. We show that, by using synthetic data and domain adaptation, we are able to reduce the number of real-world samples needed to achieve a given level of performance by up to 50 times, using only randomly generated simulated objects. We also show that by using only unlabeled real-world data and our GraspGAN methodology, we obtain real-world grasping performance without any real-world labels that is similar to that achieved with 939,777 labeled real-world samples.

459 citations

Posted Content

Sim-to-Real: Learning Agile Locomotion For Quadruped Robots

Jie Tan¹, Tingnan Zhang², Erwin Coumans¹, Atil Iscen¹, Yunfei Bai¹, Danijar Hafner¹, Steven Bohez³, Vincent Vanhoucke¹

Google¹, Georgia Institute of Technology², Ghent University³

27 Apr 2018 - arXiv: Robotics

TL;DR: A system is proposed that learns quadruped locomotion from scratch using simple reward signals; users can provide an open-loop reference to guide the learning process when more control over the learned gait is needed.

Abstract: Designing agile locomotion for quadruped robots often requires extensive expertise and tedious manual tuning. In this paper, we present a system to automate this process by leveraging deep reinforcement learning techniques. Our system can learn quadruped locomotion from scratch using simple reward signals. In addition, users can provide an open loop reference to guide the learning process when more control over the learned gait is needed. The control policies are learned in a physics simulator and then deployed on real robots. In robotics, policies trained in simulation often do not transfer to the real world. We narrow this reality gap by improving the physics simulator and learning robust policies. We improve the simulation using system identification, developing an accurate actuator model and simulating latency. We learn robust controllers by randomizing the physical environments, adding perturbations and designing a compact observation space. We evaluate our system on two agile locomotion gaits: trotting and galloping. After learning in simulation, a quadruped robot can successfully perform both gaits in the real world.

270 citations

QT-Opt: Scalable Deep Reinforcement Learning for Vision-Based Robotic Manipulation

Dmitry Kalashnikov, Alex Irpan, Peter Pastor, Julian Ibarz, Alexander Herzog, Eric Jang, Deirdre Quillen, Ethan Holly, Mrinal Kalakrishnan, Vincent Vanhoucke, Sergey Levine

27 Jun 2018

TL;DR: QT-Opt is a scalable self-supervised vision-based reinforcement learning framework that leverages over 580k real-world grasp attempts to train a deep neural network Q-function with over 1.2M parameters.

Abstract: In this paper, we study the problem of learning vision-based dynamic manipulation skills using a scalable reinforcement learning approach. We study this problem in the context of grasping, a longstanding challenge in robotic manipulation. In contrast to static learning behaviors that choose a grasp point and then execute the desired grasp, our method enables closed-loop vision-based control, whereby the robot continuously updates its grasp strategy based on the most recent observations to optimize long-horizon grasp success. To that end, we introduce QT-Opt, a scalable self-supervised vision-based reinforcement learning framework that can leverage over 580k real-world grasp attempts to train a deep neural network Q-function with over 1.2M parameters to perform closed-loop, real-world grasping that generalizes to 96% grasp success on unseen objects. Aside from attaining a very high success rate, our method exhibits behaviors that are quite distinct from more standard grasping systems: using only RGB vision-based perception from an over-the-shoulder camera, our method automatically learns regrasping strategies, probes objects to find the most effective grasps, learns to reposition objects and perform other non-prehensile pre-grasp manipulations, and responds dynamically to disturbances and perturbations.

221 citations

Policies Modulating Trajectory Generators

Atil Iscen, Ken Caluwaerts, Jie Tan, Tingnan Zhang, Erwin Coumans, Vikas Sindhwani, Vincent Vanhoucke

23 Oct 2018

TL;DR: It is demonstrated that a simple linear policy, when paired with a parametric Trajectory Generator for quadrupedal gaits, can induce walking behaviors with controllable speed from 4-dimensional IMU observations alone, and can be learned in under 1000 rollouts.

Abstract: We propose an architecture for learning complex controllable behaviors by having simple Policies Modulate Trajectory Generators (PMTG), a powerful combination that can provide both memory and prior knowledge to the controller. The result is a flexible architecture that is applicable to a class of problems with periodic motion for which one has an insight into the class of trajectories that might lead to a desired behavior. We illustrate the basics of our architecture using a synthetic control problem, then go on to learn speed-controlled locomotion for a quadrupedal robot by using Deep Reinforcement Learning and Evolutionary Strategies. We demonstrate that a simple linear policy, when paired with a parametric Trajectory Generator for quadrupedal gaits, can induce walking behaviors with controllable speed from 4-dimensional IMU observations alone, and can be learned in under 1000 rollouts. We also transfer these policies to a real robot and show locomotion with controllable forward velocity.
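
The idea of a simple policy modulating a trajectory generator can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: the sinusoidal generator, the class and function names, and the choice of three policy outputs (frequency, amplitude, residual) are all hypothetical. The only detail taken from the abstract is the linear policy and the 4-dimensional observation.

```python
import numpy as np


class SineTrajectoryGenerator:
    """Hypothetical periodic generator; its phase acts as the controller's memory."""

    def __init__(self):
        self.phase = 0.0

    def step(self, frequency, amplitude, dt):
        # Advance the internal phase, then emit one point on the sine trajectory.
        self.phase = (self.phase + 2.0 * np.pi * frequency * dt) % (2.0 * np.pi)
        return amplitude * np.sin(self.phase)


def linear_policy(weights, observation, phase):
    """Linear map from (observation, phase features) to generator parameters."""
    inputs = np.concatenate([observation, [np.sin(phase), np.cos(phase)]])
    frequency, amplitude, residual = weights @ inputs
    return frequency, amplitude, residual


def control_step(generator, weights, observation, dt=0.01):
    """Action = generator output under policy-chosen parameters + policy residual."""
    frequency, amplitude, residual = linear_policy(weights, observation, generator.phase)
    return generator.step(frequency, amplitude, dt) + residual


# Usage with a 4-dimensional observation, so weights has shape (3, 4 + 2).
weights = np.zeros((3, 6))
weights[0, 1] = 1.0  # frequency is read from observation[1]
weights[1, 0] = 1.0  # amplitude is read from observation[0]
action = control_step(SineTrajectoryGenerator(), weights, np.array([2.0, 25.0, 0.0, 0.0]))
```

In this sketch the generator supplies the periodic prior, while the policy only has to choose its parameters and a small correction, which is one way a low-capacity policy could still produce structured motion.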

79 citations

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping.

Eric Jang, Coline Devin, Vincent Vanhoucke, Sergey Levine

23 Oct 2018

TL;DR: In this article, a representation learning approach based on object persistence is proposed to acquire effective object-centric representations for robotic manipulation tasks without human labeling by using autonomous robot interaction with the environment.

Abstract: Well structured visual representations can make robot learning faster and can improve generalization. In this paper, we study how we can acquire effective object-centric representations for robotic manipulation tasks without human labeling by using autonomous robot interaction with the environment. Such representation learning methods can benefit from continuous refinement of the representation as the robot collects more experience, allowing them to scale effectively without human intervention. Our representation learning approach is based on object persistence: when a robot removes an object from a scene, the representation of that scene should change according to the features of the object that was removed. We formulate an arithmetic relationship between feature vectors from this observation, and use it to learn a representation of scenes and objects that can then be used to identify object instances, localize them in the scene, and perform goal-directed grasping tasks where the robot must retrieve commanded objects from a bin. The same grasping procedure can also be used to automatically collect training data for our method, by recording images of scenes, grasping and removing an object, and recording the outcome. Our experiments demonstrate that this self-supervised approach for tasked grasping substantially outperforms direct reinforcement learning from images and prior representation learning methods.
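
The object-persistence relationship described above (the scene representation changes by the features of the removed object) can be illustrated with a toy example. The embedding below is additive by construction, which is an assumption made purely for illustration; in the paper the representation is learned so that the relationship holds, and all names here are hypothetical.

```python
import numpy as np


def embed_scene(object_features):
    """Toy scene embedding: the sum of the feature vectors of the objects present."""
    return np.sum(object_features, axis=0)


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


rng = np.random.default_rng(0)
features = rng.normal(size=(3, 8))  # hypothetical: 3 objects, 8-dim features each

pre_grasp = embed_scene(features)        # scene with all three objects
post_grasp = embed_scene(features[1:])   # same scene after object 0 is removed

# The difference of scene embeddings should match the removed object's features.
removed = pre_grasp - post_grasp

# Identify the removed object by nearest cosine similarity.
best_match = max(range(len(features)), key=lambda i: cosine(removed, features[i]))
```

The same difference vector could serve as a query for locating or retrieving a commanded object, which is the use the abstract describes for goal-directed grasping.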

78 citations

Journal Article

Classification of crystallization outcomes using deep convolutional neural networks.

Andrew E. Bruno¹, Patrick Charbonneau², Janet Newman³, Edward H. Snell¹, Edward H. Snell⁴, David R. So⁵, Vincent Vanhoucke⁵, Christopher J. Watkins³, Shawn P. Williams⁶, Julie Wilson⁷

University at Buffalo¹, Duke University², Commonwealth Scientific and Industrial Research Organisation³, Hauptman-Woodward Medical Research Institute⁴, Google⁵, GlaxoSmithKline⁶, University of York⁷

20 Jun 2018 - PLOS ONE

TL;DR: The Machine Recognition of Crystallization Outcomes (MARCO) initiative has assembled roughly half a million annotated images of macromolecular crystallization experiments from various sources and setups.

Abstract: The Machine Recognition of Crystallization Outcomes (MARCO) initiative has assembled roughly half a million annotated images of macromolecular crystallization experiments from various sources and setups. Here, state-of-the-art machine learning algorithms are trained and tested on different parts of this data set. We find that more than 94% of the test images can be correctly labeled, irrespective of their experimental origin. Because crystal recognition is key to high-density screening and the systematic analysis of crystallization experiments, this approach opens the door to both industrial and fundamental research applications.

65 citations

Posted Content

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping.

Eric Jang, Coline Devin, Vincent Vanhoucke, Sergey Levine

16 Nov 2018 - arXiv: Robotics

TL;DR: This paper studies how to acquire effective object-centric representations for robotic manipulation tasks without human labeling, using self-supervised, autonomous robot interaction with the environment.

Abstract: Well structured visual representations can make robot learning faster and can improve generalization. In this paper, we study how we can acquire effective object-centric representations for robotic manipulation tasks without human labeling by using autonomous robot interaction with the environment. Such representation learning methods can benefit from continuous refinement of the representation as the robot collects more experience, allowing them to scale effectively without human intervention. Our representation learning approach is based on object persistence: when a robot removes an object from a scene, the representation of that scene should change according to the features of the object that was removed. We formulate an arithmetic relationship between feature vectors from this observation, and use it to learn a representation of scenes and objects that can then be used to identify object instances, localize them in the scene, and perform goal-directed grasping tasks where the robot must retrieve commanded objects from a bin. The same grasping procedure can also be used to automatically collect training data for our method, by recording images of scenes, grasping and removing an object, and recording the outcome. Our experiments demonstrate that this self-supervised approach for tasked grasping substantially outperforms direct reinforcement learning from images and prior representation learning methods.

46 citations
