Samuel McCandlish
Researcher at Stanford University
Publications: 36
Citations: 15,716
Samuel McCandlish is an academic researcher from Stanford University. The author has contributed to research in the topics Computer science and Geodesic. The author has an h-index of 17 and has co-authored 23 publications receiving 4,824 citations. Previous affiliations of Samuel McCandlish include OpenAI and Brandeis University.
Papers
Proceedings Article
Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Thomas Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Samuel McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
TL;DR: GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic.
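The few-shot setup behind this result is simple to sketch: the model is given a natural-language task description plus a handful of solved examples in its context window, and completes the next example with no gradient updates. Below is a minimal Python illustration of such a prompt format; the build_prompt helper and the translation examples are hypothetical stand-ins, not the paper's actual evaluation harness.

# Minimal sketch of few-shot prompting as described above. The helper
# and the examples are illustrative assumptions, not taken from the paper.
def build_prompt(task_description, examples, query):
    lines = [task_description]
    for source, target in examples:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # the model is asked to complete this line
    return "\n".join(lines)

prompt = build_prompt(
    "Translate English to French:",
    [("sea otter", "loutre de mer"), ("cheese", "fromage")],
    "plush giraffe",
)
# `prompt` is passed to the language model verbatim; all "learning"
# happens in-context, with no parameter updates.

The point of the format is that the task specification is purely textual, which is what makes the resulting performance task-agnostic.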
Posted Content
Language Models are Few-Shot Learners
Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, Sandhini Agarwal, Ariel Herbert-Voss, Gretchen Krueger, Thomas Henighan, Rewon Child, Aditya Ramesh, Daniel M. Ziegler, Jeffrey Wu, Clemens Winter, Christopher Hesse, Mark Chen, Eric Sigler, Mateusz Litwin, Scott Gray, Benjamin Chess, Jack Clark, Christopher Berner, Samuel McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
TL;DR: This article showed that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.
Posted Content
Scaling Laws for Neural Language Models
Jared Kaplan, Samuel McCandlish, Thomas Henighan, Tom B. Brown, Benjamin Chess, Rewon Child, Scott Gray, Alec Radford, Jeffrey Wu, Dario Amodei
TL;DR: Larger models are significantly more sample-efficient, such that optimally compute-efficient training involves training very large models on a relatively modest amount of data and stopping significantly before convergence.
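The headline claim can be stated as a formula. When model size is the only bottleneck, the paper finds that test loss falls as a power law in the number of non-embedding parameters N; the constants below are quoted from memory and should be treated as approximate:

L(N) = (N_c / N)^(α_N),  with α_N ≈ 0.076 and N_c ≈ 8.8 × 10^13,

with analogous power laws in dataset size D and training compute C. Because the exponents are small, each fixed reduction in loss requires a large multiplicative increase in scale, which is what drives the prescription to train very large models and stop well before convergence.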
Journal Article
Integral geometry and holography
TL;DR: In this article, the authors present a mathematical framework which underlies the connection between information theory and the bulk spacetime in the AdS3/CFT2 correspondence, and explain how basic geometric concepts (points, distances, and angles) are reflected in kinematic space, allowing one to reconstruct a large class of spatial bulk geometries from boundary entanglement entropies.
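For orientation, the boundary-to-bulk link the TL;DR refers to runs through the Ryu-Takayanagi relation, a standard ingredient of AdS/CFT stated here for context rather than quoted from the paper: the entanglement entropy of a boundary region A equals the area of the minimal bulk surface γ_A anchored on its boundary,

S(A) = Area(γ_A) / (4 G_N).

Kinematic space repackages this data, one geodesic at a time, so that bulk lengths and angles can be read off from boundary entropies.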
Journal Article
A Stereoscopic Look into the Bulk
TL;DR: In this article, the authors present a dictionary of non-local CFT operators whose duals are simple, diffeomorphism-invariant bulk operators, such as the modular Hamiltonian, which is dual to the fluctuation in the area of a minimal surface.
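The modular-Hamiltonian example can be made concrete via the first law of entanglement entropy, again a standard relation quoted for context rather than from the paper's text: for a small perturbation of the state,

δ⟨K_A⟩ = δS(A) = δArea(γ_A) / (4 G_N),

so the expectation value of the boundary modular Hamiltonian K_A tracks the fluctuation in the area of the minimal surface γ_A, which is the duality the TL;DR describes.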