Open Access · Posted Content

Amnesic Probing: Behavioral Explanation with Amnesic Counterfactuals

TLDR
The inability to infer behavioral conclusions from probing results is pointed out, and an alternative method is offered that focuses on how the information is being used rather than on what information is encoded.
Abstract
A growing body of work makes use of probing to investigate the working of neural models, often considered black boxes. Recently, an ongoing debate emerged surrounding the limitations of the probing paradigm. In this work, we point out the inability to infer behavioral conclusions from probing results and offer an alternative method that focuses on how the information is being used, rather than on what information is encoded. Our method, Amnesic Probing, follows the intuition that the utility of a property for a given task can be assessed by measuring the influence of a causal intervention that removes it from the representation. Equipped with this new analysis tool, we can ask questions that were not possible before, e.g. is part-of-speech information important for word prediction? We perform a series of analyses on BERT to answer these types of questions. Our findings demonstrate that conventional probing performance is not correlated to task importance, and we call for increased scrutiny of claims that draw behavioral or causal conclusions from probing results.
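To make the intervention concrete: the property is removed with an iterative nullspace-projection (INLP-style) step, and the model's behavior is compared before and after removal. The sketch below is a minimal illustration of that idea, not the authors' code; `reps`, `pos_labels`, and the synthetic data are placeholders standing in for real BERT layer representations and property labels.

```python
# Minimal sketch of the amnesic-probing idea: iteratively train a linear probe
# for the property, project the representations onto the probe's nullspace,
# and then test how the model behaves without that information.
import numpy as np
from sklearn.linear_model import LogisticRegression

def nullspace_projection(W):
    """Projection matrix onto the nullspace of probe weights W (k x d)."""
    _, S, Vt = np.linalg.svd(W, full_matrices=False)
    basis = Vt[S > 1e-10]                 # orthonormal basis of the probe's rowspace
    return np.eye(W.shape[1]) - basis.T @ basis

def amnesic_transform(reps, labels, n_rounds=10):
    """Compose projections until the property is no longer linearly decodable."""
    P = np.eye(reps.shape[1])
    X = reps
    for _ in range(n_rounds):
        probe = LogisticRegression(max_iter=1000).fit(X, labels)
        P = nullspace_projection(probe.coef_) @ P
        X = reps @ P.T                    # re-project the original representations
    return P

# Synthetic stand-ins for BERT representations and POS labels (illustrative only).
rng = np.random.default_rng(0)
reps = rng.normal(size=(500, 64))
pos_labels = rng.integers(0, 5, size=500)
P = amnesic_transform(reps, pos_labels)
amnesic_reps = reps @ P.T                 # "amnesic" representations
```

The behavioral question is then whether the model's predictions (e.g., masked-word accuracy) change when `amnesic_reps` replace the original layer outputs: a large drop suggests the property was actually used, while a negligible drop suggests it was merely encoded.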


Citations
Proceedings Article

Locating and Editing Factual Associations in GPT

TL;DR: The authors analyzed the storage and recall of factual associations in autoregressive transformer language models, finding evidence that these associations correspond to localized, directly-editable computations, and proposed a causal intervention for identifying neuron activations that are decisive in a model's factual predictions.
Journal Article

Dissociating language and thought in large language models: a cognitive perspective

TL;DR: In this article, the authors review the capabilities of large language models by considering their performance on two different aspects of language use: "formal linguistic competence", which includes knowledge of the rules and patterns of a given language, and "functional linguistic competence", a host of cognitive abilities required for language understanding and use in the real world.
Journal Article

Investigating Gender Bias in BERT

TL;DR: This paper focuses on a popular CLM, BERT, and proposes an algorithm that finds fine-grained gender directions, i.e., one primary direction for each BERT layer, which obviates the need to realize a gender subspace in multiple dimensions and prevents other crucial information from being omitted.
Journal Article

Locating and Editing Factual Knowledge in GPT

TL;DR: An important role is found for mid-layer feed-forward modules in storing factual associations in autoregressive transformer language models, and it is suggested that direct manipulation of these computational mechanisms may be a feasible approach for model editing.
Proceedings Article

Linear Adversarial Concept Erasure

TL;DR: This paper formulates the problem of identifying and erasing a linear subspace that corresponds to a given concept, in order to prevent linear predictors from recovering the concept, and shows that the method, despite being linear, is highly expressive, effectively mitigating bias in deep nonlinear classifiers while maintaining tractability and interpretability.
References
Proceedings Article

Attention is All you Need

TL;DR: This paper proposed the Transformer, a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and achieved state-of-the-art performance on English-to-French translation.
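For reference, the core operation behind the architecture is scaled dot-product attention. Below is a minimal single-head numpy sketch (no masking, no multi-head projections); the shapes and variable names are illustrative only.

```python
# Scaled dot-product attention: each output vector is a softmax-weighted
# mixture of value vectors, with weights given by query-key similarity.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # (seq, seq) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ V                                 # (seq, d_v) outputs

# Toy usage with random vectors standing in for token representations.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)            # shape (4, 8)
```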
Proceedings Article

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

TL;DR: BERT, introduced in this paper, pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the pre-trained model can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
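The masked-word prediction objective at the heart of BERT's pre-training can be inspected directly with the public `bert-base-uncased` checkpoint via the HuggingFace `transformers` library; the snippet below is an illustrative usage sketch, not code from the paper.

```python
# Query BERT's masked-language-model head for the most likely fillers of [MASK].
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

inputs = tokenizer("The cat sat on the [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and list the top-5 predicted tokens for it.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
top5 = logits[0, mask_pos].topk(5).indices.tolist()
print(tokenizer.convert_ids_to_tokens(top5))
```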
Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

TL;DR: It is found that BERT was significantly undertrained and can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.
Proceedings Article

Deep contextualized word representations

TL;DR: This paper introduced a new type of deep contextualized word representation that models both complex characteristics of word use (e.g., syntax and semantics), and how these uses vary across linguistic contexts (i.e., to model polysemy).