Home
/
Authors
/
Grusha Prasad

Author

Grusha Prasad

Bio: Grusha Prasad is an academic researcher from Johns Hopkins University. The author has contributed to research in topics: Language model & Sentence. The author has an hindex of 6, co-authored 11 publications receiving 94 citations.

Topics: Language model, Sentence, Syntax, Counterfactual thinking, Grammar ...read more

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Dynabench: Rethinking Benchmarking in NLP.

[...]

Douwe Kiela¹, Max Bartolo², Yixin Nie³, Divyansh Kaushik¹, Atticus Geiger⁴, Zhengxuan Wu⁴, Bertie Vidgen⁵, Grusha Prasad⁶, Amanpreet Singh⁷, Pratik Ringshia⁷, Zhiyi Ma, Tristan Thrush⁸, Sebastian Riedel⁷, Zeerak Waseem⁹, Pontus Stenetorp¹⁰, Robin Jia¹¹, Mohit Bansal³, Christopher Potts⁴, Adina Williams⁷ - Show less +15 more•Institutions (11)

Carnegie Mellon University¹, University College London², University of North Carolina at Chapel Hill³, Stanford University⁴, University of Oxford⁵, Johns Hopkins University⁶, Facebook⁷, Université catholique de Louvain⁸, University of Sheffield⁹, Salesforce.com¹⁰, University of Southern California¹¹

01 Jun 2021

TL;DR: It is argued that Dynabench addresses a critical need in the community: contemporary models quickly achieve outstanding performance on benchmark tasks but nonetheless fail on simple challenge examples and falter in real-world scenarios.

...read moreread less

Abstract: We introduce Dynabench, an open-source platform for dynamic dataset creation and model benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the-loop dataset creation: annotators seek to create examples that a target model will misclassify, but that another person will not. In this paper, we argue that Dynabench addresses a critical need in our community: contemporary models quickly achieve outstanding performance on benchmark tasks but nonetheless fail on simple challenge examples and falter in real-world scenarios. With Dynabench, dataset creation, model development, and model assessment can directly inform each other, leading to more robust and informative benchmarks. We report on four initial NLP tasks, illustrating these concepts and highlighting the promise of the platform, and address potential objections to dynamic benchmarking as a new standard for the field.

...read moreread less

175 citations

Proceedings Article•DOI•

Using Priming to Uncover the Organization of Syntactic Representations in Neural Language Models

[...]

Grusha Prasad¹, Marten van Schijndel², Tal Linzen¹•Institutions (2)

Johns Hopkins University¹, Cornell University²

17 Sep 2019

TL;DR: The authors used the syntactic priming paradigm from psycholinguistics to reconstruct the organization of LSTM LMs' syntactic representational space, showing that LSTMs' representations of different types of sentences with relative clauses are organized hierarchically in a linguistically interpretable manner, suggesting that LMs track abstract properties of the sentence.

...read moreread less

Abstract: Neural language models (LMs) perform well on tasks that require sensitivity to syntactic structure. Drawing on the syntactic priming paradigm from psycholinguistics, we propose a novel technique to analyze the representations that enable such success. By establishing a gradient similarity metric between structures, this technique allows us to reconstruct the organization of the LMs’ syntactic representational space. We use this technique to demonstrate that LSTM LMs’ representations of different types of sentences with relative clauses are organized hierarchically in a linguistically interpretable manner, suggesting that the LMs track abstract properties of the sentence.

...read moreread less

24 citations

Posted Content•

Using Priming to Uncover the Organization of Syntactic Representations in Neural Language Models

[...]

Grusha Prasad¹, Marten van Schijndel², Tal Linzen¹•Institutions (2)

Johns Hopkins University¹, Cornell University²

23 Sep 2019-arXiv: Computation and Language

TL;DR: This work uses a gradient similarity metric to demonstrate that LSTM LMs' representations of different types of sentences with relative clauses are organized hierarchically in a linguistically interpretable manner, suggesting that the LMs track abstract properties of the sentence.

...read moreread less

Abstract: Neural language models (LMs) perform well on tasks that require sensitivity to syntactic structure. Drawing on the syntactic priming paradigm from psycholinguistics, we propose a novel technique to analyze the representations that enable such success. By establishing a gradient similarity metric between structures, this technique allows us to reconstruct the organization of the LMs' syntactic representational space. We use this technique to demonstrate that LSTM LMs' representations of different types of sentences with relative clauses are organized hierarchically in a linguistically interpretable manner, suggesting that the LMs track abstract properties of the sentence.

...read moreread less

19 citations

Posted Content•

Dynabench: Rethinking Benchmarking in NLP

[...]

07 Apr 2021-arXiv: Computation and Language

TL;DR: Dynabench as mentioned in this paper is an open-source platform for dynamic dataset creation and model benchmarking that allows annotators to create examples that a target model will misclassify, but that another person will not.

...read moreread less

14 citations

Journal Article•DOI•

Rapid syntactic adaptation in self-paced reading: Detectable, but only with many participants.

[...]

Grusha Prasad¹, Tal Linzen²•Institutions (2)

Johns Hopkins University¹, New York University²

01 Jul 2021-Journal of Experimental Psychology: Learning, Memory and Cognition

TL;DR: It is concluded that selfpaced reading studies cannot provide unambiguous evidence for rapid syntactic adaptation, and preliminary evidence that the decrease in garden path effect is driven by asymmetric effects of task adaptation is provided.

...read moreread less

Abstract: Temporarily ambiguous sentences that are disambiguated in favor of a less preferred parse are read more slowly than their unambiguous counterparts. This slowdown is referred to as a garden path effect. Recent self-paced reading studies have found that this effect decreased over the course of the experiment as participants were exposed to such syntactically ambiguous sentences. This decrease in the magnitude of the effect has been interpreted as evidence that readers calibrate their expectations to the context; this minimizes their surprise when they encounter these initially unexpected syntactic structures. Such recalibration of syntactic expectations, referred to as syntactic adaptation, is only one possible explanation for the decrease in garden path effect, however; this decrease could also be driven instead by increased familiarity with the self-paced reading paradigm (task adaptation). The goal of this article is to adjudicate between these two explanations. In a large between-group study (n = 642), we find evidence for syntactic adaptation over and above task adaptation. The magnitude of syntactic adaptation compared to task adaptation is very small, however. Power analyses show that a large number of participants is required to detect, with adequate power, syntactic adaptation in future between-subjects self-paced reading studies. This issue is exacerbated in experiments designed to detect modulations of the basic syntactic adaptation effect; such experiments are likely to be underpowered even with more than 1,200 participants. We conclude that while, contrary to recent suggestions, syntactic adaptation can be detected using self-paced reading, this paradigm is not very effective for studying this phenomenon. (PsycInfo Database Record (c) 2021 APA, all rights reserved).

...read moreread less

13 citations

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

A Metaverse: Taxonomy, Components, Applications, and Open Challenges

[...]

01 Jan 2022-IEEE Access

TL;DR: In this article , the authors divide the concepts and essential techniques necessary for realizing the Metaverse into three components (i.e., hardware, software, and contents) rather than marketing or hardware approach to conduct a comprehensive analysis.

...read moreread less

Abstract: Unlike previous studies on the Metaverse based on Second Life, the current Metaverse is based on the social value of Generation Z that online and offline selves are not different. With the technological development of deep learning-based high-precision recognition models and natural generation models, Metaverse is being strengthened with various factors, from mobile-based always-on access to connectivity with reality using virtual currency. The integration of enhanced social activities and neural-net methods requires a new definition of Metaverse suitable for the present, different from the previous Metaverse. This paper divides the concepts and essential techniques necessary for realizing the Metaverse into three components (i.e., hardware, software, and contents) and three approaches (i.e., user interaction, implementation, and application) rather than marketing or hardware approach to conduct a comprehensive analysis. Furthermore, we describe essential methods based on three components and techniques to Metaverse’s representative Ready Player One, Roblox, and Facebook research in the domain of films, games, and studies. Finally, we summarize the limitations and directions for implementing the immersive Metaverse as social influences, constraints, and open challenges.

...read moreread less

313 citations

Journal Article•DOI•

A Metaverse: Taxonomy, Components, Applications, and Open Challenges

[...]

Sang-Min Park, Young-Gab Kim

IEEE Access

TL;DR: This paper divides the concepts and essential techniques necessary for realizing the Metaverse into three components (i.e., hardware, software, and contents) and three approaches and describes essential methods based on three components and techniques to Metaverse’s representative Ready Player One, Roblox, and Facebook research in the domain of films, games, and studies.

...read moreread less

241 citations

Journal Article•DOI•

Holistic Evaluation of Language Models

[...]

Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher R'e, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Byron Rogers, Mirac M. Suzgun, Nathan S. Kim, Neel Guha, Niladri S. Chatterji, Peter Henderson, Qian Huang, Ryan Chi, Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda - Show less +45 more

16 Nov 2022-Annals of the New York Academy of Sciences

TL;DR: The Holistic Evaluation of Language Models (HELM) as mentioned in this paper ) is a popular benchmark for language models, with 30 models evaluated on 16 core scenarios and 7 metrics, exposing important trade-offs.

...read moreread less

Abstract: Language models (LMs) like GPT-3, PaLM, and ChatGPT are the foundation for almost all major language technologies, but their capabilities, limitations, and risks are not well understood. We present Holistic Evaluation of Language Models (HELM) to improve the transparency of LMs. LMs can serve many purposes and their behavior should satisfy many desiderata. To navigate the vast space of potential scenarios and metrics, we taxonomize the space and select representative subsets. We evaluate models on 16 core scenarios and 7 metrics, exposing important trade-offs. We supplement our core evaluation with seven targeted evaluations to deeply analyze specific aspects (including world knowledge, reasoning, regurgitation of copyrighted content, and generation of disinformation). We benchmark 30 LMs, from OpenAI, Microsoft, Google, Meta, Cohere, AI21 Labs, and others. Prior to HELM, models were evaluated on just 17.9% of the core HELM scenarios, with some prominent models not sharing a single scenario in common. We improve this to 96.0%: all 30 models are now benchmarked under the same standardized conditions. Our evaluation surfaces 25 top-level findings. For full transparency, we release all raw model prompts and completions publicly. HELM is a living benchmark for the community, continuously updated with new scenarios, metrics, and models https://crfm.stanford.edu/helm/latest/.

...read moreread less

168 citations

Posted Content•

Hypothesis Only Baselines in Natural Language Inference

[...]

Adam Poliak¹, Jason Naradowsky¹, Aparajita Haldar², Rachel Rudinger¹, Benjamin Van Durme¹ - Show less +1 more•Institutions (2)

Johns Hopkins University¹, Birla Institute of Technology and Science²

02 May 2018-arXiv: Computation and Language

TL;DR: This approach, which is referred to as a hypothesis-only model, is able to significantly outperform a majority-class baseline across a number of NLI datasets and suggests that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.

...read moreread less

Abstract: We propose a hypothesis only baseline for diagnosing Natural Language Inference (NLI). Especially when an NLI dataset assumes inference is occurring based purely on the relationship between a context and a hypothesis, it follows that assessing entailment relations while ignoring the provided context is a degenerate solution. Yet, through experiments on ten distinct NLI datasets, we find that this approach, which we refer to as a hypothesis-only model, is able to significantly outperform a majority class baseline across a number of NLI datasets. Our analysis suggests that statistical irregularities may allow a model to perform NLI in some datasets beyond what should be achievable without access to the context.

...read moreread less

140 citations

Proceedings Article•DOI•

PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts

[...]

Stephen H. Bach, Victor Sanh, Zheng-Xin Yong, Albert Webson, Colin Raffel, Nihal V. Nayak, Abheesht Sharma, Taewoon Kim, M Saiful Bari, Thibault Févry, Zaid Alyafeai, Manan Dey, Andrea Santilli, Zhiqing Sun, Srulik Ben-David, Canwen Xu, Gunjan Chhablani, Han Wang, Jason A. Fries, Maged S. Al-shaibani, Shanya Sharma, Urmish Thakker, Khalid Almubarak, Xiangru Tang, Mike Tian-Jian Jiang, Alexander M. Rush - Show less +22 more

02 Feb 2022

TL;DR: PromptSource addresses the emergent challenges in this new setting with a templating language for defining data-linked prompts, an interface that lets users quickly iterate on prompt development by observing outputs of their prompts on many examples, and a community-driven set of guidelines for contributing new prompts to a common pool.

...read moreread less

Abstract: PromptSource is a system for creating, sharing, and using natural language prompts. Prompts are functions that map an example from a dataset to a natural language input and target output. Using prompts to train and query language models is an emerging area in NLP that requires new tools that let users develop and refine these prompts collaboratively. PromptSource addresses the emergent challenges in this new setting with (1) a templating language for defining data-linked prompts, (2) an interface that lets users quickly iterate on prompt development by observing outputs of their prompts on many examples, and (3) a community-driven set of guidelines for contributing new prompts to a common pool. Over 2,000 prompts for roughly 170 datasets are already available in PromptSource. PromptSource is available at https://github.com/bigscience-workshop/promptsource.

...read moreread less

100 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51

Collapse