Heewoo Jun

Researcher at Baidu

Publications -  20
Citations -  2438

Heewoo Jun is an academic researcher from Baidu. The author has contributed to research in topics: Language model & Natural language. The author has an h-index of 13, co-authored 17 publications receiving 1211 citations. Previous affiliations of Heewoo Jun include OpenAI.

Papers
Proceedings Article

Generative Pretraining From Pixels

TL;DR: This work trains a sequence Transformer to auto-regressively predict pixels, without incorporating knowledge of the 2D input structure, and finds that a GPT-2 scale model learns strong image representations as measured by linear probing, fine-tuning, and low-data classification.
Posted Content

Deep Learning Scaling is Predictable, Empirically

TL;DR: A large-scale empirical characterization of generalization error and model size growth as training sets grow is presented, and it is shown that model size scales sublinearly with data size.
Posted Content

Jukebox: A Generative Model for Music

TL;DR: It is shown that the combined model at scale can generate high-fidelity and diverse songs with coherence up to multiple minutes, and can condition on artist and genre to steer the musical and vocal style, and on unaligned lyrics to make the singing more controllable.
Proceedings Article

Cold Fusion: Training Seq2Seq Models Together with Language Models

TL;DR: This work presents the Cold Fusion method, which leverages a pre-trained language model during training of a Seq2Seq model, and shows its effectiveness on speech recognition: models trained with Cold Fusion make better use of language information, converge faster, generalize better, and transfer almost completely to a new domain while using less than 10% of the labeled training data.
Posted Content

Scaling Laws for Autoregressive Generative Modeling

TL;DR: Empirical scaling laws for the cross-entropy loss are identified, strengthening the case that scaling laws have important implications for neural network performance, including on downstream tasks.