Ilya Sutskever

Researcher at OpenAI

Publications - 137
Citations - 294374

Ilya Sutskever is an academic researcher from OpenAI. The author has contributed to research in the topics Artificial neural network and Reinforcement learning. The author has an h-index of 75 and has co-authored 131 publications receiving 235539 citations. Previous affiliations of Ilya Sutskever include Google and the University of Toronto.

Papers
Proceedings Article

One-Shot Imitation Learning

TL;DR: One-shot imitation learning, as proposed in this paper, is a meta-learning framework for learning from very few demonstrations of a given task and immediately generalizing to new situations of the same task, without requiring task-specific engineering.
Proceedings Article

Multi-task Sequence to Sequence Learning

TL;DR: The results show that training on a small amount of parsing and image caption data can improve the translation quality between English and German by up to 1.5 BLEU points over strong single-task baselines on the WMT benchmarks, and reveal interesting properties of the two unsupervised learning objectives, autoencoder and skip-thought, in the MTL context.
Posted Content

Learning To Generate Reviews and Discovering Sentiment

TL;DR: The properties of byte-level recurrent language models are explored, and a single unit that performs sentiment analysis is found; the learned representations achieve state of the art on the binary subset of the Stanford Sentiment Treebank.
Proceedings Article

The Recurrent Temporal Restricted Boltzmann Machine

TL;DR: The Recurrent TRBM is introduced, which is a very slight modification of the TRBM for which exact inference is very easy and exact gradient learning is almost tractable.
Proceedings Article

Continuous deep Q-learning with model-based acceleration

TL;DR: This paper derives a continuous variant of the Q-learning algorithm, which it calls normalized advantage functions (NAF), as an alternative to the more commonly used policy gradient and actor-critic methods, and substantially improves performance on a set of simulated robotic control tasks.