Jared Kaplan
Researcher at Johns Hopkins University
Publications - 109
Citations - 26112
Jared Kaplan is an academic researcher at Johns Hopkins University. His research spans topics including the AdS/CFT correspondence and central charges. He has an h-index of 50 and has co-authored 109 publications receiving 15020 citations. Previous affiliations of Jared Kaplan include Princeton University and the University of California, Davis.
Papers
Proceedings Article
Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Thomas Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Samuel McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei +30 more
TL;DR: GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
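The "few-shot" setting summarized above conditions the model on a handful of in-context demonstrations rather than updating any weights. A minimal sketch of that prompt format (the function name and the arithmetic examples are illustrative, not the paper's actual evaluation harness):

```python
# Hypothetical sketch of few-shot prompting: demonstrations and a final
# query are concatenated into a single text prompt for the language model.

def build_few_shot_prompt(examples, query):
    """Format (question, answer) demonstration pairs followed by the
    query, leaving the final answer slot for the model to complete."""
    lines = [f"Q: {q}\nA: {a}" for q, a in examples]
    lines.append(f"Q: {query}\nA:")
    return "\n\n".join(lines)

# Example in the spirit of the paper's 3-digit arithmetic tasks.
demos = [("What is 12 + 7?", "19"), ("What is 45 + 38?", "83")]
print(build_few_shot_prompt(demos, "What is 123 + 456?"))
```

No gradient updates occur; the demonstrations steer the model purely through conditioning.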
Posted Content
Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Thomas Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Samuel McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei +30 more
TL;DR: This article showed that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.
Posted Content
Scaling Laws for Neural Language Models
Jared Kaplan, Samuel McCandlish, Thomas Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei +9 more
TL;DR: Larger models are significantly more sample-efficient, such that optimally compute-efficient training involves training very large models on a relatively modest amount of data and stopping significantly before convergence.
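The central empirical finding is that test loss falls as a power law in model size. A hedged sketch of that relationship, L(N) ≈ (N_c / N)^α_N, where N is non-embedding parameter count; the constants below are approximate values quoted in the paper, used here only for illustration:

```python
# Power-law scaling of language-model loss with parameter count,
# following the form reported in "Scaling Laws for Neural Language Models".
# ALPHA_N and N_C are approximate published constants, not exact values.
ALPHA_N = 0.076
N_C = 8.8e13  # reference non-embedding parameter count

def loss_from_params(n_params: float) -> float:
    """Predicted cross-entropy loss (nats/token) for a model with
    n_params non-embedding parameters, assuming data and compute
    are not the bottleneck."""
    return (N_C / n_params) ** ALPHA_N

# Doubling model size multiplies the predicted loss by 2**-ALPHA_N (~0.95),
# i.e. roughly a 5% loss reduction per doubling.
print(loss_from_params(2e9) / loss_from_params(1e9))
```

The smallness of the exponent is what drives the paper's conclusion: large loss improvements require order-of-magnitude increases in scale, and the compute-optimal strategy is to train very large models and stop well before convergence.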
Journal ArticleDOI
The Effective Field Theory of Inflation
TL;DR: The effective field theory of inflation is the most general theory describing the fluctuations around a quasi-de Sitter background; in the case of single-field models, the scalar mode can be eaten by the metric by going to unitary gauge.
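The unitary-gauge construction the summary refers to can be sketched as follows (a standard form of the EFT-of-inflation action; the higher-order coefficients M_n(t) are free functions of time, and the sign conventions assumed here are the mostly-plus metric):

```latex
S = \int d^4x \, \sqrt{-g} \left[ \frac{1}{2} M_{\rm Pl}^2 R
  + M_{\rm Pl}^2 \dot{H} \, g^{00}
  - M_{\rm Pl}^2 \left( 3H^2 + \dot{H} \right)
  + \frac{M_2(t)^4}{2} \left( \delta g^{00} \right)^2 + \cdots \right]
```

The first three terms reproduce any single-field background H(t); the tower of operators in δg^{00} = g^{00} + 1 then parametrizes the most general fluctuations around it, which is the sense in which the theory is "the most general" one for a quasi-de Sitter background.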