Explaining NLP Models via Minimal Contrastive Editing (MiCE)

About
This article was published at the Meeting of the Association for Computational Linguistics on 2021-08-01 and is currently open access. It has received 15 citations to date.

Citations
Posted Content

Contrastive Explanations for Model Interpretability

TL;DR: This paper proposes a method for generating contrastive explanations of classification models by modifying the representation to disregard non-contrastive information and modifying model behavior to be based only on contrastive reasoning.
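
Below is a minimal sketch of the basic linear operation behind such representation modification: projecting a hidden state onto the orthogonal complement of a direction so that the information along that direction is disregarded. The vectors and the single-direction setup are illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def project_out(h, direction):
    """Remove the component of representation h that lies along `direction`."""
    u = direction / np.linalg.norm(direction)
    return h - np.dot(h, u) * u

h = np.random.randn(768)   # stand-in for a transformer hidden state
d = np.random.randn(768)   # hypothetical direction carrying unwanted information
h_mod = project_out(h, d)
print(np.dot(h_mod, d))    # ~0: the information along d has been removed
```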
Journal Article

Post-hoc Interpretability for Neural NLP: A Survey

TL;DR: This survey of post-hoc interpretability methods for neural NLP provides a categorization of interpretability techniques and describes how they communicate explanations to humans.
Journal Article

A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned, and Perspectives

TL;DR: Contrastive self-supervised training objectives have recently enabled successes in image representation pretraining by learning to contrast input-input pairs of augmented images as either similar or dissimilar.
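
The sketch below shows the kind of contrastive objective this primer surveys (an InfoNCE / NT-Xent style loss): embeddings of two augmented views of the same input form a positive pair, and all other pairings in the batch serve as negatives. Function names, dimensions, and the temperature value are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """Contrast matched rows of z1 and z2 (positives) against all other rows."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature        # pairwise cosine similarities
    targets = torch.arange(z1.size(0))        # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

z1 = torch.randn(8, 128)  # stand-in embeddings of 8 augmented views
z2 = torch.randn(8, 128)  # embeddings of the matching second views
print(info_nce(z1, z2).item())
```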
Proceedings Article

Explainable Cross-Topic Stance Detection for Search Results

TL;DR: The authors study the feasibility of using current stance detection approaches to assist users in their web search on debated topics, and investigate the quality of stance detection explanations created with different explainability methods and explanation visualization techniques.
Book Chapter

Towards Generating Counterfactual Examples as Automatic Short Answer Feedback

TL;DR: Counterfactual explanation methods are used to automatically generate feedback from short answer grading models. The results highlight the general weakness of neural networks against adversarial examples and motivate further research on more reliable grading models, for example by including external knowledge sources or adversarial training.
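
As a rough illustration of the counterfactual-generation idea, the sketch below greedily searches for a single-token edit that flips a black-box grader's prediction. `predict` and `vocabulary` are hypothetical stand-ins, and real systems, including the chapter's, use far more sophisticated edit generators.

```python
def greedy_counterfactual(tokens, predict, vocabulary, target_label):
    """Return a single-token edit of `tokens` that `predict` maps to target_label."""
    for i, original in enumerate(tokens):
        for candidate in vocabulary:
            if candidate == original:
                continue
            edited = tokens[:i] + [candidate] + tokens[i + 1:]
            if predict(" ".join(edited)) == target_label:
                return edited            # minimal one-token counterfactual
    return None                          # no single-token edit flips the label
```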
References
Posted Content

RoBERTa: A Robustly Optimized BERT Pretraining Approach

TL;DR: The authors find that BERT was significantly undertrained and, when trained longer and more carefully, can match or exceed the performance of every model published after it; their best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.
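
As a small usage sketch (not from the paper), RoBERTa's masked-language-model head can be queried through the Hugging Face Transformers pipeline API; note that RoBERTa uses `<mask>` as its mask token.

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")  # downloads the checkpoint
for pred in fill("The movie was absolutely <mask>."):
    print(pred["token_str"], round(pred["score"], 3))
```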
Monograph

Causality: models, reasoning, and inference

TL;DR: This monograph develops the art and science of cause and effect, covering the theory of inferred causation, causal diagrams, and the identification of causal effects.
Proceedings Article

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

TL;DR: The authors propose LIME, a technique that explains the predictions of any classifier by learning an interpretable model locally around each prediction, and select representative individual predictions and explanations in a non-redundant way by framing the task as a submodular optimization problem.
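
A minimal usage sketch of the LIME library this paper introduces: `explain_instance` perturbs the input text and fits a sparse local surrogate model. The toy classifier below is an illustrative stand-in; any function mapping a list of texts to an (n_samples, n_classes) probability array works.

```python
import numpy as np
from lime.lime_text import LimeTextExplainer

def toy_classifier(texts):
    """Hypothetical sentiment model: probability of 'positive' grows with 'good'."""
    counts = np.array([[t.lower().count("good")] for t in texts], dtype=float)
    pos = 1.0 / (1.0 + np.exp(-(counts - 0.5)))
    return np.hstack([1.0 - pos, pos])

explainer = LimeTextExplainer(class_names=["negative", "positive"])
exp = explainer.explain_instance("A good, genuinely good film.", toy_classifier,
                                 num_features=4)
print(exp.as_list())  # (word, weight) pairs approximating the local decision
```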
Proceedings Article

Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps

TL;DR: The gradient of the class score with respect to the input image is used to compute a class saliency map, which can be applied to weakly supervised object segmentation using classification ConvNets.
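
A minimal PyTorch sketch of that gradient-based class saliency map; an untrained network and a random tensor stand in for a trained ConvNet and a preprocessed image, so only the mechanics are illustrated.

```python
import torch
from torchvision import models

model = models.resnet18(weights=None).eval()       # untrained stand-in ConvNet
img = torch.randn(1, 3, 224, 224, requires_grad=True)
class_score = model(img)[0, 42]                    # score of an arbitrary class
class_score.backward()                             # d(score)/d(pixels)
saliency = img.grad.abs().squeeze(0).max(dim=0).values  # (224, 224) map
print(saliency.shape)
```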
Proceedings Article

Transformers: State-of-the-Art Natural Language Processing

TL;DR: Transformers is an open-source library of carefully engineered, state-of-the-art Transformer architectures under a unified API, together with a curated collection of pretrained models made by and available for the community.
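
A minimal sketch of that unified API: the same two Auto classes load any supported architecture by checkpoint name. The checkpoint below is a real public sentiment model, chosen here only for illustration.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tokenizer("Minimal contrastive edits read naturally.", return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(dim=-1)
print(dict(zip(["negative", "positive"], probs[0].tolist())))
```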