Explaining NLP Models via Minimal Contrastive Editing (MiCE)

About
This article was published at the Meeting of the Association for Computational Linguistics on 2021-08-01 and is currently open access. It has received 15 citations to date.

Citations
Posted Content

Contrastive Explanations for Model Interpretability

TL;DR: This paper proposes a method for generating contrastive explanations of classification models by modifying the representation to disregard non-contrastive information and modifying model behavior to be based only on contrastive reasoning.
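
Below is a minimal sketch of the basic linear operation behind such representation modification: projecting a hidden state onto the orthogonal complement of a direction so that the information along that direction is disregarded. The vectors and the single-direction setup are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def project_out(h, direction):
    """Remove the component of representation h that lies along `direction`."""
    u = direction / np.linalg.norm(direction)
    return h - np.dot(h, u) * u

h = np.random.randn(768)   # stand-in for a transformer hidden state
d = np.random.randn(768)   # hypothetical direction carrying unwanted information
h_mod = project_out(h, d)
print(np.dot(h_mod, d))    # ~0: the information along d has been removed
```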
Journal Article

Post-hoc Interpretability for Neural NLP: A Survey

TL;DR: This survey of post-hoc interpretability methods for neural NLP provides a categorization of interpretability techniques and describes how they communicate explanations to humans.
Journal Article

A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives

TL;DR: Contrastive self-supervised training objectives have recently enabled successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar.
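
The sketch below shows the kind of contrastive objective this primer surveys (an InfoNCE / NT-Xent style loss): embeddings of two augmented views of the same input form a positive pair, and all other pairings in the batch serve as negatives. Function names, dimensions, and the temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """Contrast matched rows of z1 and z2 (positives) against all other rows."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature        # pairwise cosine similarities
    targets = torch.arange(z1.size(0))        # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

z1 = torch.randn(8, 128)  # stand-in embeddings of 8 augmented views
z2 = torch.randn(8, 128)  # embeddings of the matching second views
print(info_nce(z1, z2).item())
```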
Proceedings Article

Explainable Cross-Topic Stance Detection for Search Results

TL;DR: The authors study the feasibility of using current stance detection approaches to assist users in their web search on debated topics, and investigate the quality of stance detection explanations created with different explainability methods and explanation visualization techniques.
Book Chapter

Towards Generating Counterfactual Examples as Automatic Short Answer Feedback

TL;DR: Counterfactual explanation methods are used to automatically generate feedback from short answer grading models. The results highlight the general weakness of neural networks against adversarial examples and motivate further research on more reliable grading models, for example by including external knowledge sources or adversarial training.
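
As a rough illustration of the counterfactual-generation idea, the sketch below greedily searches for a single-token edit that flips a black-box grader's prediction. `predict` and `vocabulary` are hypothetical stand-ins, and real systems, including the chapter's, use far more sophisticated edit generators.

```python
def greedy_counterfactual(tokens, predict, vocabulary, target_label):
    """Return a single-token edit of `tokens` that `predict` maps to target_label."""
    for i, original in enumerate(tokens):
        for candidate in vocabulary:
            if candidate == original:
                continue
            edited = tokens[:i] + [candidate] + tokens[i + 1:]
            if predict(" ".join(edited)) == target_label:
                return edited            # minimal one-token counterfactual
    return None                          # no single-token edit flips the label
```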
References
Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

TL;DR: The authors find that BERT was significantly undertrained and, when trained longer and more carefully, can match or exceed the performance of every model published after it; their best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.
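
As a small usage sketch (not from the paper), RoBERTa's masked-language-model head can be queried through the Hugging Face Transformers pipeline API; note that RoBERTa uses `<mask>` as its mask token.

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")  # downloads the checkpoint
for pred in fill("The movie was absolutely <mask>."):
    print(pred["token_str"], round(pred["score"], 3))
```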
Monograph

Causality: models, reasoning, and inference

TL;DR: This monograph develops the art and science of cause and effect, covering the theory of inferred causation, causal diagrams, and the identification of causal effects.
Proceedings Article

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

TL;DR: The authors propose LIME, a technique that explains the predictions of any classifier by learning an interpretable model locally around each prediction, and select representative individual predictions and explanations in a non-redundant way by framing the task as a submodular optimization problem.
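
A minimal usage sketch of the LIME library this paper introduces: `explain_instance` perturbs the input text and fits a sparse local surrogate model. The toy classifier below is an illustrative stand-in; any function mapping a list of texts to an (n_samples, n_classes) probability array works.

```python
import numpy as np
from lime.lime_text import LimeTextExplainer

def toy_classifier(texts):
    """Hypothetical sentiment model: probability of 'positive' grows with 'good'."""
    counts = np.array([[t.lower().count("good")] for t in texts], dtype=float)
    pos = 1.0 / (1.0 + np.exp(-(counts - 0.5)))
    return np.hstack([1.0 - pos, pos])

explainer = LimeTextExplainer(class_names=["negative", "positive"])
exp = explainer.explain_instance("A good, genuinely good film.", toy_classifier,
                                 num_features=4)
print(exp.as_list())  # (word, weight) pairs approximating the local decision
```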
Proceedings Article

Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps

TL;DR: The gradient of the class score with respect to the input image is used to compute a class saliency map, which can be applied to weakly supervised object segmentation using classification ConvNets.
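
A minimal PyTorch sketch of that gradient-based class saliency map; an untrained network and a random tensor stand in for a trained ConvNet and a preprocessed image, so only the mechanics are illustrated.

```python
import torch
from torchvision import models

model = models.resnet18(weights=None).eval()       # untrained stand-in ConvNet
img = torch.randn(1, 3, 224, 224, requires_grad=True)
class_score = model(img)[0, 42]                    # score of an arbitrary class
class_score.backward()                             # d(score)/d(pixels)
saliency = img.grad.abs().squeeze(0).max(dim=0).values  # (224, 224) map
print(saliency.shape)
```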
Proceedings Article

Transformers: State-of-the-Art Natural Language Processing

TL;DR: Transformers is an open-source library of carefully engineered, state-of-the-art Transformer architectures under a unified API, together with a curated collection of pretrained models made by and available for the community.
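
A minimal sketch of that unified API: the same two Auto classes load any supported architecture by checkpoint name. The checkpoint below is a real public sentiment model, chosen here only for illustration.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("Minimal contrastive edits read naturally.", return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(dict(zip(["negative", "positive"], probs[0].tolist())))
```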