Institution

Massachusetts Institute of Technology

Education - Cambridge, Massachusetts, United States
About: Massachusetts Institute of Technology is an education organization based in Cambridge, Massachusetts, United States. It is known for research contributions in the topics Population and Laser. The organization has 116795 authors who have published 268000 publications receiving 18272025 citations. The organization is also known as MIT and M.I.T.


Papers
Posted Content
TL;DR: This article showed that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches.
Abstract: Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic in architecture, this method still requires task-specific fine-tuning datasets of thousands or tens of thousands of examples. By contrast, humans can generally perform a new language task from only a few examples or from simple instructions - something which current NLP systems still largely struggle to do. Here we show that scaling up language models greatly improves task-agnostic, few-shot performance, sometimes even reaching competitiveness with prior state-of-the-art fine-tuning approaches. Specifically, we train GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and test its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model. GPT-3 achieves strong performance on many NLP datasets, including translation, question-answering, and cloze tasks, as well as several tasks that require on-the-fly reasoning or domain adaptation, such as unscrambling words, using a novel word in a sentence, or performing 3-digit arithmetic. At the same time, we also identify some datasets where GPT-3's few-shot learning still struggles, as well as some datasets where GPT-3 faces methodological issues related to training on large web corpora. Finally, we find that GPT-3 can generate samples of news articles which human evaluators have difficulty distinguishing from articles written by humans. We discuss broader societal impacts of this finding and of GPT-3 in general.

1,886 citations

Journal Article
10 Jan 1997 - Science
TL;DR: In this paper, the Raman spectra of single wall carbon nanotubes (SWNTs) were studied using laser excitation wavelengths in the range from 514.5 to 1320 nanometers.
Abstract: Single wall carbon nanotubes (SWNTs) that are found as close-packed arrays in crystalline ropes have been studied by using Raman scattering techniques with laser excitation wavelengths in the range from 514.5 to 1320 nanometers. Numerous Raman peaks were observed and identified with vibrational modes of armchair symmetry (n, n) SWNTs. The Raman spectra are in good agreement with lattice dynamics calculations based on C-C force constants used to fit the two-dimensional, experimental phonon dispersion of a single graphene sheet. Calculated intensities from a nonresonant, bond polarizability model optimized for sp2 carbon are also in qualitative agreement with the Raman data, although a resonant Raman scattering process is also taking place. This resonance results from the one-dimensional quantum confinement of the electrons in the nanotube.

1,882 citations

Journal Article
01 Nov 2012 - Genetics
TL;DR: A suite of methods for learning about population mixtures are presented, implemented in a software package called ADMIXTOOLS, that support formal tests for whether mixture occurred and make it possible to infer proportions and dates of mixture.
Abstract: Population mixture is an important process in biology. We present a suite of methods for learning about population mixtures, implemented in a software package called ADMIXTOOLS, that support formal tests for whether mixture occurred and make it possible to infer proportions and dates of mixture. We also describe the development of a new single nucleotide polymorphism (SNP) array consisting of 629,433 sites with clearly documented ascertainment that was specifically designed for population genetic analyses and that we genotyped in 934 individuals from 53 diverse populations. To illustrate the methods, we give a number of examples that provide new insights about the history of human admixture. The most striking finding is a clear signal of admixture into northern Europe, with one ancestral population related to present-day Basques and Sardinians and the other related to present-day populations of northeast Asia and the Americas. This likely reflects a history of admixture between Neolithic migrants and the indigenous Mesolithic population of Europe, consistent with recent analyses of ancient bones from Sweden and the sequencing of the genome of the Tyrolean "Iceman."

1,877 citations

Journal Article
TL;DR: The controller updates considered here are event-driven, depending on the ratio of a certain measurement error with respect to the norm of a function of the state, and are applied to a first order agreement problem.
Abstract: Event-driven strategies for multi-agent systems are motivated by the future use of embedded microprocessors with limited resources that will gather information and actuate the individual agent controller updates. The controller updates considered here are event-driven, depending on the ratio of a certain measurement error with respect to the norm of a function of the state, and are applied to a first order agreement problem. A centralized formulation is considered first and then its distributed counterpart, in which agents require knowledge only of their neighbors' states for the controller implementation. The results are then extended to a self-triggered setup, where each agent computes its next update time at the previous one, without having to keep track of the state error that triggers the actuation between two consecutive update instants. The results are illustrated through simulation examples.

1,876 citations

Journal Article
TL;DR: This paper argues that if, at least in the upper tail, all cities follow some proportional growth process (which appears to be verified empirically), then their size distribution automatically converges to Zipf's law.
Abstract: Zipf's law is a very tight constraint on the class of admissible models of local growth. It says that for most countries the size distribution of cities strikingly fits a power law: the number of cities with populations greater than S is proportional to 1/S. Suppose that, at least in the upper tail, all cities follow some proportional growth process (this appears to be verified empirically). This automatically leads their distribution to converge to Zipf's law.

1,875 citations


Authors


Name                  H-index  Papers  Citations
Eric S. Lander        301      826     525976
Robert Langer         281      2324    326306
George M. Whitesides  240      1739    269833
Trevor W. Robbins     231      1137    164437
George Davey Smith    224      2540    248373
Yi Cui                220      1015    199725
Robert J. Lefkowitz   214      860     147995
David J. Hunter       213      1836    207050
Daniel Levy           212      933     194778
Rudolf Jaenisch       206      606     178436
Mark J. Daly          204      763     304452
David Miller          203      2573    204840
David Baltimore       203      876     162955
Rakesh K. Jain        200      1467    177727
Ronald M. Evans       199      708     166722

Network Information
Related Institutions (5)
University of California, Berkeley
265.6K papers, 16.8M citations

96% related

Stanford University
320.3K papers, 21.8M citations

95% related

University of Illinois at Urbana–Champaign
225.1K papers, 10.1M citations

95% related

University of California, San Diego
204.5K papers, 12.3M citations

95% related

Columbia University
224K papers, 12.8M citations

94% related

Performance
Metrics
No. of papers from the Institution in previous years
Year  Papers
2023  240
2022  1,124
2021  10,595
2020  11,922
2019  11,207
2018  10,883