Alex Ray
Researcher at OpenAI
Publications - 11
Citations - 8928
Alex Ray is an academic researcher at OpenAI. The author has contributed to research in topics: Reinforcement learning & Computer science. The author has an h-index of 8, having co-authored 9 publications receiving 4,711 citations.
Papers
Proceedings ArticleDOI
Domain randomization for transferring deep neural networks from simulation to the real world
TL;DR: This paper explores domain randomization, a simple technique for training models on simulated images that transfer to real images by randomizing rendering in the simulator. It achieves the first successful transfer of a deep neural network trained only on simulated RGB images to the real world for the purpose of robotic control.
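The core idea of domain randomization can be sketched in a few lines: every simulated training image is rendered with randomly sampled visual parameters, so the real world looks like just one more variation to the trained model. The parameter names and ranges below are illustrative assumptions, not the paper's actual configuration.

```python
import random

def sample_render_params(rng: random.Random) -> dict:
    """Sample one random rendering configuration for a simulated scene.

    Each training image would be rendered under a fresh draw of these
    parameters (hypothetical names/ranges chosen for illustration).
    """
    return {
        "texture_id": rng.randrange(1000),            # random object/background texture
        "light_intensity": rng.uniform(0.2, 2.0),     # random lighting strength
        "camera_jitter": [rng.uniform(-0.05, 0.05) for _ in range(3)],
        "distractor_count": rng.randint(0, 10),       # random distractor objects
    }

rng = random.Random(0)
batch = [sample_render_params(rng) for _ in range(4)]  # params for 4 training renders
```

Because no single randomized configuration matches reality, the model is pushed to rely on features that survive all the variations, which is what makes the sim-to-real transfer work.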
Proceedings ArticleDOI
Training language models to follow instructions with human feedback
Long Ouyang,Jeffrey Wu,Xu Jiang,Diogo Almeida,Carroll L. Wainwright,Pamela Mishkin,Chong Zhang,Sandhini Agarwal,Katarina Slama,Alex Ray,John Schulman,Jacob Hilton,Fraser Kelton,Luke E. Miller,Maddie Simens,Amanda Askell,Peter Welinder,Paul F. Christiano,Jan Leike,Ryan Lowe +19 more
TL;DR: The results show that fine-tuning with human feedback is a promising direction for aligning language models with human intent, showing improvements in truthfulness and reductions in toxic output generation while having minimal performance regressions on public NLP datasets.
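A central ingredient of this line of work is a reward model trained from human comparisons of model outputs. The sketch below shows the standard pairwise preference loss used for such training, -log σ(r_chosen - r_rejected); it is a minimal illustration of the objective, not the paper's implementation, and the variable names are assumptions.

```python
import math

def preference_loss(r_chosen: float, r_rejected: float) -> float:
    """Negative log-likelihood that the human-preferred response ranks higher.

    The reward model assigns scalar scores; this loss shrinks as the margin
    between the preferred and rejected responses grows.
    """
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

loss_good = preference_loss(2.0, 0.0)  # model already ranks the pair correctly
loss_bad = preference_loss(0.0, 2.0)   # model ranks the pair the wrong way
```

Once trained, the reward model's score serves as the optimization target for fine-tuning the language model with reinforcement learning.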
Proceedings Article
Hindsight Experience Replay
Marcin Andrychowicz,Filip Wolski,Alex Ray,Jonas Schneider,Rachel Fong,Peter Welinder,Bob McGrew,Josh Tobin,Pieter Abbeel,Wojciech Zaremba +9 more
TL;DR: A novel technique is presented that allows sample-efficient learning from rewards which are sparse and binary, avoiding the need for complicated reward engineering; it may be seen as a form of implicit curriculum.
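The mechanism behind Hindsight Experience Replay is goal relabeling: a failed episode is replayed as if the goal the agent actually reached had been the intended one, so sparse binary rewards still yield learning signal. Below is a minimal sketch of the "future" relabeling strategy, assuming a goal-conditioned episode stored as (state, action, achieved_goal, desired_goal) tuples; the data layout is an illustrative assumption.

```python
import random

def her_relabel(episode, rng):
    """Return original transitions plus copies relabeled with a future achieved goal."""
    out = []
    for t, (state, action, achieved, desired) in enumerate(episode):
        # Original transition: sparse binary reward against the intended goal.
        reward = 1.0 if achieved == desired else 0.0
        out.append((state, action, desired, reward))
        # 'future' strategy: substitute the goal achieved at a later step.
        future = rng.randrange(t, len(episode))
        new_goal = episode[future][2]
        new_reward = 1.0 if achieved == new_goal else 0.0
        out.append((state, action, new_goal, new_reward))
    return out

rng = random.Random(0)
# Toy episode: the agent never reaches the desired goal (5,).
episode = [((0,), 1, (1,), (5,)), ((1,), 1, (2,), (5,))]
relabeled = her_relabel(episode, rng)
```

Even though every original transition earns reward 0, the relabeled copies include successful outcomes, which is the implicit curriculum the summary refers to.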
Journal ArticleDOI
Learning dexterous in-hand manipulation
OpenAI: Marcin Andrychowicz,Bowen Baker,Maciek Chociej,Rafal Jozefowicz,Bob McGrew,Jakub Pachocki,Arthur Petron,Matthias Plappert,Glenn Powell,Alex Ray,Jonas Schneider,Szymon Sidor,Josh Tobin,Peter Welinder,Lilian Weng,Wojciech Zaremba +15 more
TL;DR: This work uses reinforcement learning (RL) to learn dexterous in-hand manipulation policies that can perform vision-based object reorientation on a physical Shadow Dexterous Hand, and these policies transfer to the physical robot despite being trained entirely in simulation.
Posted Content
Domain Randomization for Transferring Deep Neural Networks from Simulation to the Real World
TL;DR: In this article, the authors use domain randomization to train a real-world object detector that is accurate to 1.5 cm and robust to distractors and partial occlusions using only data from a simulator with non-realistic random textures.