Open Access Proceedings Article

Measuring Association Between Labels and Free-Text Rationales

TL;DR
The authors investigate whether the labels and free-text rationales predicted by joint self-rationalizing models are associated, and find that state-of-the-art T5-based joint models exhibit desirable properties for explaining commonsense question-answering and natural language inference.
Abstract
In interpretable NLP, we require faithful rationales that reflect the model’s decision-making process for an explained instance. While prior work focuses on extractive rationales (a subset of the input words), we investigate their less-studied counterpart: free-text natural language rationales. We demonstrate that *pipelines*, models for faithful rationalization on information-extraction style tasks, do not work as well on “reasoning” tasks requiring free-text rationales. We turn to models that *jointly* predict and rationalize, a class of widely used high-performance models for free-text rationalization. We investigate the extent to which the labels and rationales predicted by these models are associated, a necessary property of faithful explanation. Via two tests, *robustness equivalence* and *feature importance agreement*, we find that state-of-the-art T5-based joint models exhibit desirable properties for explaining commonsense question-answering and natural language inference, indicating their potential for producing faithful free-text rationales.
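To make the *robustness equivalence* test concrete, the sketch below captures its flavor: corrupt inputs at increasing noise levels and check whether label accuracy and rationale quality degrade together, rather than one collapsing while the other stays stable. This is a minimal illustration, not the authors' implementation; `predict_label_and_rationale`, `rationale_quality`, and the token-replacement noise model are all hypothetical placeholders.

```python
import random

def corrupt(tokens, noise_level, vocab):
    """Replace each token with a random vocabulary word with probability noise_level."""
    return [random.choice(vocab) if random.random() < noise_level else t
            for t in tokens]

def robustness_equivalence(examples, predict_label_and_rationale,
                           rationale_quality, vocab,
                           noise_levels=(0.0, 0.2, 0.4, 0.6)):
    """Record label accuracy and mean rationale quality at each noise level.

    If labels and rationales are associated, the two curves should degrade
    together as noise increases.
    """
    curves = []
    for p in noise_levels:
        correct, quality = 0, 0.0
        for tokens, gold_label in examples:
            noisy = corrupt(tokens, p, vocab)
            # Hypothetical joint model API: returns (label, rationale text).
            label, rationale = predict_label_and_rationale(noisy)
            correct += int(label == gold_label)
            quality += rationale_quality(rationale, tokens)
        n = len(examples)
        curves.append((p, correct / n, quality / n))
    return curves
```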


Citations
Posted Content

Teach Me to Explain: A Review of Datasets for Explainable NLP

TL;DR: The authors identify three predominant classes of explanations (highlights, free-text, and structured) and organize the literature on annotating each type, point to what has been learned to date, and give recommendations for collecting ExNLP datasets in the future.
Posted Content

Learning to Rationalize for Nonmonotonic Reasoning with Distant Supervision

TL;DR: This paper investigates multiple ways to automatically generate rationales using pre-trained language models, neural knowledge models, and distant supervision from related tasks, and trains generative models capable of composing explanatory rationales for unseen instances.
Posted Content

SelfExplain: A Self-Explaining Architecture for Neural Text Classifiers

TL;DR: SelfExplain is a self-explaining framework that explains a text classifier's predictions using phrase-based concepts. It augments existing neural classifiers with a globally interpretable layer that identifies the most influential concepts in the training set for a given sample, and a locally interpretable layer that quantifies the contribution of each local input concept by computing a relevance score relative to the predicted label.
Posted Content

Evaluating Explanations for Reading Comprehension with Realistic Counterfactuals

TL;DR: This paper proposes a methodology for evaluating explanations: an explanation should allow us to understand a reading-comprehension model's high-level behavior with respect to a set of realistic counterfactual input scenarios. By connecting explanation techniques' outputs to high-level model behavior, the authors evaluate how useful different explanations really are.
Posted Content

When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data

TL;DR: This paper studies the circumstances under which explanations of individual data points can (or cannot) improve modeling performance, identifies properties of datasets for which retrieval-based modeling fails, and suggests that at least one of six preconditions for successful modeling fails to hold with these datasets.
References
Proceedings Article

Attention is All you Need

TL;DR: This paper proposes the Transformer, a simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, and reports state-of-the-art performance on English-to-French translation.
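At the core of that architecture is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V. A minimal NumPy sketch of that one operation, for illustration only (it omits the multi-head projections and masking of the full model):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (n_q, n_k) similarity scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of value vectors
```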
Proceedings Article

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
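As a reminder of the mechanics, BLEU combines modified n-gram precisions with a brevity penalty that punishes overly short candidates. NLTK ships an implementation; a small illustrative usage (the example sentences are made up):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["the", "cat", "is", "on", "the", "mat"]
candidate = ["the", "cat", "sat", "on", "the", "mat"]

# BLEU = BP * exp(sum_n w_n * log p_n), with modified n-gram precisions p_n
# and brevity penalty BP; smoothing avoids zero scores on short sentences.
smooth = SmoothingFunction().method1
score = sentence_bleu([reference], candidate, smoothing_function=smooth)
print(f"BLEU: {score:.3f}")
```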
Proceedings Article

Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank

TL;DR: This paper introduces the Stanford Sentiment Treebank, which includes fine-grained sentiment labels for 215,154 phrases in the parse trees of 11,855 sentences and presents new challenges for sentiment compositionality, and proposes the Recursive Neural Tensor Network to address them.
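The Recursive Neural Tensor Network composes two child phrase vectors a and b with a bilinear tensor term plus the standard recursive-NN affine term: p = tanh([a; b]^T V [a; b] + W [a; b]). A small NumPy sketch of one composition step, with random parameters purely for illustration:

```python
import numpy as np

def rntn_compose(a, b, V, W):
    """One RNTN composition: p = tanh([a;b]^T V [a;b] + W [a;b]).

    a, b : (d,) child phrase vectors
    V    : (d, 2d, 2d) tensor, one bilinear slice per output dimension
    W    : (d, 2d) standard recursive-NN weight matrix
    """
    ab = np.concatenate([a, b])                                   # (2d,)
    tensor_term = np.array([ab @ V[k] @ ab for k in range(V.shape[0])])
    return np.tanh(tensor_term + W @ ab)                          # (d,) parent vector

d = 4
rng = np.random.default_rng(0)
a, b = rng.normal(size=d), rng.normal(size=d)
parent = rntn_compose(a, b, rng.normal(size=(d, 2 * d, 2 * d)),
                      rng.normal(size=(d, 2 * d)))
```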
Proceedings Article

Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps

TL;DR: In this paper, the gradient of the class score with respect to the input image is used to compute a class saliency map, which can be applied to weakly supervised object segmentation using classification ConvNets.
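A minimal PyTorch sketch of that gradient-based saliency computation; any differentiable image classifier works, and the stock torchvision ResNet and random input here are stand-ins for illustration:

```python
import torch
from torchvision.models import resnet18

model = resnet18(weights=None).eval()  # stand-in classifier; pretrained weights optional

image = torch.rand(1, 3, 224, 224, requires_grad=True)  # dummy input image
scores = model(image)                                    # class scores (logits)
top_class = scores.argmax(dim=1).item()

# Backpropagate the top class score to the input pixels.
scores[0, top_class].backward()

# Saliency = max absolute gradient over color channels, per pixel.
saliency = image.grad.abs().max(dim=1).values.squeeze(0)  # (224, 224) map
```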
Proceedings Article

Transformers: State-of-the-Art Natural Language Processing

TL;DR: Transformers is an open-source library that consists of carefully engineered state-of-the-art Transformer architectures under a unified API and a curated collection of pretrained models made by and available for the community.
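For example, that unified API exposes pretrained models through a single `pipeline` entry point (the default model is downloaded on first use; the input sentence is just an example):

```python
from transformers import pipeline

# Loads a default pretrained sentiment-analysis model on first use.
classifier = pipeline("sentiment-analysis")
print(classifier("Free-text rationales make model decisions easier to audit."))
# -> [{'label': 'POSITIVE', 'score': ...}]
```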