
Kawin Ethayarajh

Researcher at Stanford University

Publications -  29
Citations -  1140

Kawin Ethayarajh is an academic researcher at Stanford University. His work spans the topics of computer science and word representations. He has an h-index of 10 and has co-authored 24 publications receiving 687 citations.

Papers
Proceedings ArticleDOI

How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings

TL;DR: It is found that in all layers of ELMo, BERT, and GPT-2, on average, less than 5% of the variance in a word's contextualized representations can be explained by a static embedding for that word, providing some justification for the success of contextualized representations.
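The "variance explained by a static embedding" in this TL;DR can be approximated by asking how much of the variance in a word's contextual vectors the best single static vector (its first principal component) accounts for. A minimal sketch of that measurement (the function name and the random toy data are illustrative, not the paper's code):

```python
import numpy as np

def max_explainable_variance(occurrences: np.ndarray) -> float:
    """Fraction of variance in a word's contextualized vectors that the
    best single static vector (first principal component) can explain.

    occurrences: (n_contexts, dim) matrix, one row per contextual embedding
    of the same word.
    """
    # Singular values of the occurrence matrix; the squared top singular
    # value is the variance captured by the first principal direction.
    _, s, _ = np.linalg.svd(occurrences, full_matrices=False)
    variances = s ** 2
    return float(variances[0] / variances.sum())

# Toy example: 100 random "contextual" vectors in 768 dimensions.
rng = np.random.default_rng(0)
vecs = rng.normal(size=(100, 768))
ratio = max_explainable_variance(vecs)
```

For random, unstructured vectors this ratio is small; the paper's finding is that it stays below 5% on average even for real contextualized embeddings.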
Proceedings ArticleDOI

Understanding Undesirable Word Embedding Associations.

TL;DR: The authors showed that debiasing vectors post hoc using subspace projection is, under certain conditions, equivalent to training on an unbiased corpus, and they also showed that WEAT, the most common association test for word embeddings, systematically overestimates bias.
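The post-hoc subspace projection this TL;DR refers to is, in its simplest one-dimensional form, removing the component of each word vector along an identified bias direction. A minimal sketch under that assumption (function name illustrative):

```python
import numpy as np

def debias(v: np.ndarray, bias_direction: np.ndarray) -> np.ndarray:
    """Project out the component of v along the bias direction."""
    b = bias_direction / np.linalg.norm(bias_direction)  # unit-normalize
    return v - (v @ b) * b  # subtract the projection onto b

# Toy vectors; in practice the bias direction is estimated from the
# embedding space (e.g., from differences of gendered word pairs).
rng = np.random.default_rng(1)
b = rng.normal(size=50)
v = rng.normal(size=50)
v_debiased = debias(v, b)
```

After projection the vector is orthogonal to the bias direction; the paper's result is that, under certain conditions, this is equivalent to having trained on an unbiased corpus in the first place.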
Proceedings ArticleDOI

Unsupervised Random Walk Sentence Embeddings: A Strong but Simple Baseline

TL;DR: This paper first shows that word vector length has a confounding effect on the probability of a sentence being generated under the random walk model of Arora et al.
Posted Content

On the Opportunities and Risks of Foundation Models.

Rishi Bommasani, +113 more
- 16 Aug 2021 - 
TL;DR: The authors provide a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications.
Proceedings ArticleDOI

Utility is in the Eye of the User: A Critique of NLP Leaderboards

TL;DR: This opinion paper formalizes how leaderboards -- in their current form -- can be poor proxies for the NLP community at large and advocates for more transparency on leaderboards, such as the reporting of statistics that are of practical concern.