Interpreting Deep Learning Models in Natural Language Processing: A Review

Open Access · Posted Content


20 Oct 2021

TLDR

The authors provide a comprehensive review of interpretation methods for neural models in NLP, including influence-function-based, KNN-based, attention-based, saliency-based, and perturbation-based methods.

Abstract:

Neural network models have achieved state-of-the-art performance in a wide range of natural language processing (NLP) tasks. However, a long-standing criticism of neural network models is their lack of interpretability, which not only reduces the reliability of neural NLP systems but also limits the scope of their applications in areas where interpretability is essential (e.g., health care applications). In response, the increasing interest in interpreting neural NLP models has spurred a diverse array of interpretation methods over recent years. In this survey, we provide a comprehensive review of various interpretation methods for neural models in NLP. We first lay out a high-level taxonomy of interpretation methods in NLP: training-based approaches, test-based approaches, and hybrid approaches. Next, we describe the sub-categories within each category in detail, e.g., influence-function-based methods, KNN-based methods, attention-based methods, saliency-based methods, and perturbation-based methods. We point out deficiencies of current methods and suggest some avenues for future research.

