Open Access · Posted Content

Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals

TLDR
The inability to infer behavioral conclusions from probing results is pointed out, and an alternative method is offered that focuses on how the information is being used rather than on what information is encoded.
Abstract
A growing body of work makes use of probing to investigate the working of neural models, often considered black boxes. Recently, an ongoing debate emerged surrounding the limitations of the probing paradigm. In this work, we point out the inability to infer behavioral conclusions from probing results and offer an alternative method that focuses on how the information is being used, rather than on what information is encoded. Our method, Amnesic Probing, follows the intuition that the utility of a property for a given task can be assessed by measuring the influence of a causal intervention that removes it from the representation. Equipped with this new analysis tool, we can ask questions that were not possible before, e.g. is part-of-speech information important for word prediction? We perform a series of analyses on BERT to answer these types of questions. Our findings demonstrate that conventional probing performance is not correlated to task importance, and we call for increased scrutiny of claims that draw behavioral or causal conclusions from probing results.
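To make the intervention concrete: the property is removed with an iterative nullspace-projection (INLP-style) step, and the model's behavior is compared before and after removal. The sketch below is a minimal illustration of that idea, not the authors' code; `reps`, `pos_labels`, and the synthetic data are placeholders standing in for real BERT layer representations and property labels.

```python
# Minimal sketch of the amnesic-probing idea: iteratively train a linear probe
# for the property, project the representations onto the probe's nullspace,
# and then test how the model behaves without that information.
import numpy as np
from sklearn.linear_model import LogisticRegression

def nullspace_projection(W):
    """Projection matrix onto the nullspace of probe weights W (k x d)."""
    _, S, Vt = np.linalg.svd(W, full_matrices=False)
    basis = Vt[S > 1e-10]                 # orthonormal basis of the probe's rowspace
    return np.eye(W.shape[1]) - basis.T @ basis

def amnesic_transform(reps, labels, n_rounds=10):
    """Compose projections until the property is no longer linearly decodable."""
    P = np.eye(reps.shape[1])
    X = reps
    for _ in range(n_rounds):
        probe = LogisticRegression(max_iter=1000).fit(X, labels)
        P = nullspace_projection(probe.coef_) @ P
        X = reps @ P.T                    # re-project the original representations
    return P

# Synthetic stand-ins for BERT representations and POS labels (illustrative only).
rng = np.random.default_rng(0)
reps = rng.normal(size=(500, 64))
pos_labels = rng.integers(0, 5, size=500)
P = amnesic_transform(reps, pos_labels)
amnesic_reps = reps @ P.T                 # "amnesic" representations
```

The behavioral question is then whether the model's predictions (e.g., masked-word accuracy) change when `amnesic_reps` replace the original layer outputs: a large drop suggests the property was actually used, while a negligible drop suggests it was merely encoded.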


Citations
Proceedings Article

Locating and Editing Factual Associations in GPT

TL;DR: The authors analyzed the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations, and proposed a causal intervention for identifying neuron activations that are decisive in a model's factual predictions.
Journal Article

Dissociating language and thought in large language models: a cognitive perspective

TL;DR: In this article, the authors review the capabilities of large language models by considering their performance on two different aspects of language use: "formal linguistic competence", which includes knowledge of the rules and patterns of a given language, and "functional linguistic competence", a host of cognitive abilities required for language understanding and use in the real world.
Journal Article

Investigating Gender Bias in BERT

TL;DR: This paper focuses on a popular CLM, BERT, and proposes an algorithm that finds fine-grained gender directions, i.e., one primary direction for each BERT layer, which obviates the need to realize a gender subspace in multiple dimensions and prevents other crucial information from being omitted.
Journal Article

Locating and Editing Factual Knowledge in GPT

TL;DR: An important role is found for mid-layer feed-forward modules in storing factual associations in autoregressive transformer language models, and it is suggested that direct manipulation of these computational mechanisms may be a feasible approach for model editing.
Proceedings Article

Linear Adversarial Concept Erasure

TL;DR: This paper formulates the problem of identifying and erasing a linear subspace that corresponds to a given concept, in order to prevent linear predictors from recovering the concept, and shows that the method, despite being linear, is highly expressive, effectively mitigating bias in deep nonlinear classifiers while maintaining tractability and interpretability.
References
Proceedings Article

Attention is All you Need

TL;DR: This paper proposed the Transformer, a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieved state-of-the-art performance on English-to-French translation.
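For reference, the core operation behind the architecture is scaled dot-product attention. Below is a minimal single-head numpy sketch (no masking, no multi-head projections); the shapes and variable names are illustrative only.

```python
# Scaled dot-product attention: each output vector is a softmax-weighted
# mixture of value vectors, with weights given by query-key similarity.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (seq, seq) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # (seq, d_v) outputs

# Toy usage with random vectors standing in for token representations.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)            # shape (4, 8)
```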
Proceedings Article

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: BERT, introduced in this paper, pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
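The masked-word prediction objective at the heart of BERT's pre-training can be inspected directly with the public `bert-base-uncased` checkpoint via the HuggingFace `transformers` library; the snippet below is an illustrative usage sketch, not code from the paper.

```python
# Query BERT's masked-language-model head for the most likely fillers of [MASK].
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

inputs = tokenizer("The cat sat on the [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and list the top-5 predicted tokens for it.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
top5 = logits[0, mask_pos].topk(5).indices.tolist()
print(tokenizer.convert_ids_to_tokens(top5))
```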
Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

TL;DR: It is found that BERT was significantly undertrained and can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.
Proceedings Article

Deep contextualized word representations

TL;DR: This paper introduced a new type of deep contextualized word representation that models both complex characteristics of word use (e.g., syntax and semantics), and how these uses vary across linguistic contexts (i.e., to model polysemy).