Heewoo Jun
Researcher at Baidu
Publications - 20
Citations - 2438
Heewoo Jun is an academic researcher at Baidu who has contributed to research on language models and natural language. The author has an h-index of 13 and has co-authored 17 publications that have received 1211 citations. Previous affiliations of Heewoo Jun include OpenAI.
Papers
Proceedings Article
Generative Pretraining From Pixels
TL;DR: This work trains a sequence Transformer to auto-regressively predict pixels, without incorporating knowledge of the 2D input structure, and finds that a GPT-2 scale model learns strong image representations as measured by linear probing, fine-tuning, and low-data classification.
Posted Content
Deep Learning Scaling is Predictable, Empirically
Joel Hestness, Sharan Narang, Newsha Ardalani, Gregory Diamos, Heewoo Jun, Hassan Kianinejad, Md. Mostofa Ali Patwary, Yang Yang, Yanqi Zhou
TL;DR: This work presents a large-scale empirical characterization of how generalization error and model size grow as training sets grow, and shows that model size scales sublinearly with data size.
Posted Content
Jukebox: A Generative Model for Music
TL;DR: It is shown that the combined model at scale can generate high-fidelity and diverse songs with coherence up to multiple minutes, and can condition on artist and genre to steer the musical and vocal style, and on unaligned lyrics to make the singing more controllable.
Proceedings Article
Cold Fusion: Training Seq2Seq Models Together with Language Models
TL;DR: This article presents the Cold Fusion method, which leverages a pre-trained language model during training, and shows its effectiveness on the speech recognition task. Seq2Seq models trained with Cold Fusion make better use of language information, converge faster, generalize better, and transfer almost completely to a new domain while using less than 10% of the labeled training data.
Posted Content
Scaling Laws for Autoregressive Generative Modeling
Thomas Henighan, Jared Kaplan, Mor Katz, Mark Chen, Christopher Hesse, Jacob Jackson, Heewoo Jun, Tom B. Brown, Prafulla Dhariwal, Scott Gray, Chris Hallacy, Benjamin Mann, Alec Radford, Aditya Ramesh, Nick Ryder, Daniel M. Ziegler, John Schulman, Dario Amodei, Samuel McCandlish
TL;DR: Empirical scaling laws for the cross-entropy loss are identified, strengthening the case that scaling laws have important implications for neural network performance, including on downstream tasks.