Open Access · Posted Content
Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop
TL;DR: A number of representative studies in each category are reviewed, including systematic manipulation of input to neural networks and the impact on their performance, and testing whether interpretable knowledge can be decoded from intermediate representations acquired by neural networks.

Abstract: The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques specifically developed for analyzing and understanding the inner workings and representations acquired by neural models of language. Approaches included: systematic manipulation of input to neural networks and investigating the impact on their performance; testing whether interpretable knowledge can be decoded from intermediate representations acquired by neural networks; proposing modifications to neural network architectures to make their knowledge state or generated output more explainable; and examining the performance of networks on simplified or formal languages. Here we review a number of representative studies in each category.
Citations
Density-based clustering based on hierarchical density estimates
TL;DR: In this article, the authors propose a density-based clustering method built on hierarchical density estimates, which provides a clustering hierarchy from which a simplified tree of significant clusters can be constructed, and demonstrate that their approach outperforms the current state-of-the-art density-based clustering methods.
Posted Content
Neural Machine Translation: A Review
TL;DR: This work traces back the origins of modern NMT architectures to word and sentence embeddings and earlier examples of the encoder-decoder network family and concludes with a survey of recent trends in the field.
Journal Article
Distributional Semantics and Linguistic Theory
TL;DR: This review provides a critical discussion of the literature on distributional semantics, with an emphasis on methods and results that are of relevance for theoretical linguistics, in three areas: semantic change, polysemy and composition, and the grammar-semantics interface.
Posted Content
Evaluating Recurrent Neural Network Explanations
TL;DR: Several methods have been proposed to explain the predictions of recurrent neural networks (RNNs), in particular LSTMs, by assigning each input variable, e.g., a word, a relevance score indicating the extent to which it contributed to a particular prediction; this work evaluates such explanation methods.
Proceedings Article
Pareto Probing: Trading Off Accuracy for Complexity
TL;DR: This work argues for a probe metric that reflects the fundamental trade-off between probe complexity and performance: the Pareto hypervolume, and presents a number of parametric and non-parametric metrics to measure complexity.
References
Journal Article
Long short-term memory
TL;DR: A novel, efficient, gradient based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
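The gating mechanism behind this constant-error-flow idea can be sketched for a single-unit cell. This is a toy scalar illustration with hypothetical parameter names, not the paper's exact formulation:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, p):
    # p holds scalar weights for a single-unit cell (hypothetical names)
    i = sigmoid(p["wi"] * x + p["ui"] * h + p["bi"])    # input gate
    f = sigmoid(p["wf"] * x + p["uf"] * h + p["bf"])    # forget gate
    o = sigmoid(p["wo"] * x + p["uo"] * h + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h + p["bg"])  # candidate value
    # Additive cell-state update: the "constant error carousel" that
    # lets gradients flow across long time lags
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new

params = {k: 0.5 for k in ["wi", "ui", "bi", "wf", "uf", "bf",
                           "wo", "uo", "bo", "wg", "ug", "bg"]}
h_new, c_new = lstm_step(1.0, 0.0, 0.0, params)
```

The key design choice is that the cell state is updated additively rather than by repeated squashing, which is what allows errors to bridge the long time lags the abstract describes.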
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposes a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, and achieves state-of-the-art performance on English-to-French translation.
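The scaled dot-product attention at the core of this architecture can be sketched in plain Python. This is a minimal single-head illustration, not the paper's full multi-head implementation; the matrix inputs here are illustrative toy values.

```python
import math

def softmax(xs):
    # Numerically stable softmax
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# One query attending over two key/value pairs; the query is closer
# to the first key, so the first value dominates the output
result = attention([[1.0, 0.0]],
                   [[1.0, 0.0], [0.0, 1.0]],
                   [[10.0, 0.0], [0.0, 10.0]])
```

Because the attention weights always sum to one, the output is a convex combination of the value vectors, which is what lets the model mix information across positions without recurrence.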
Journal Article
Finding Structure in Time
TL;DR: This work develops a proposal along these lines, first described by Jordan (1986), which involves the use of recurrent links to provide networks with a dynamic memory, and suggests a method for representing lexical categories and the type/token distinction.
Proceedings Article
Categorical Reparameterization with Gumbel-Softmax
Eric Jang, Shixiang Gu, Ben Poole
TL;DR: Gumbel-Softmax replaces non-differentiable samples from a categorical distribution with differentiable samples from a novel Gumbel-Softmax distribution, which has the essential property that it can be smoothly annealed into the categorical distribution.
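The sampling trick itself is compact: perturb the logits with Gumbel noise and pass them through a temperature-scaled softmax. A minimal sketch, with illustrative logits and temperature chosen for this example:

```python
import math
import random

random.seed(0)  # deterministic noise for reproducibility

def gumbel_softmax(logits, tau=1.0):
    # Gumbel(0, 1) noise: g = -log(-log(u)), u ~ Uniform(0, 1)
    gumbels = [-math.log(-math.log(random.random())) for _ in logits]
    # Perturb logits, divide by temperature tau, apply softmax
    z = [(l + g) / tau for l, g in zip(logits, gumbels)]
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

# A "soft" sample over three categories; as tau -> 0 the sample
# approaches one-hot, recovering discrete categorical sampling
sample = gumbel_softmax([2.0, 0.5, -1.0], tau=0.5)
```

Annealing tau toward zero during training is what lets the relaxation converge to the categorical distribution while keeping gradients defined at every step.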
Journal Article
On Pixel-Wise Explanations for Non-Linear Classifier Decisions by Layer-Wise Relevance Propagation.
Sebastian Bach, Alexander Binder, Grégoire Montavon, Frederick Klauschen, Klaus-Robert Müller, Wojciech Samek
TL;DR: This work proposes a general solution to the problem of understanding classification decisions by pixel-wise decomposition of nonlinear classifiers, introducing a methodology that allows one to visualize the contributions of single pixels to predictions for kernel-based classifiers over bag-of-words features and for multilayered neural networks.
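The core idea of layer-wise relevance propagation is to redistribute a layer's output relevance back to its inputs in proportion to each input's contribution to the pre-activation. A simplified sketch of the epsilon-stabilized rule for a single linear layer (toy weights; not the paper's full multi-layer procedure):

```python
def lrp_linear(x, W, b, rel_out, eps=1e-6):
    # Contribution of input j to output k: z[j][k] = x[j] * W[j][k]
    n_in, n_out = len(x), len(rel_out)
    z = [[x[j] * W[j][k] for k in range(n_out)] for j in range(n_in)]
    # Total pre-activation per output, stabilized by a small epsilon
    denom = [sum(z[j][k] for j in range(n_in)) + b[k] for k in range(n_out)]
    denom = [d + (eps if d >= 0 else -eps) for d in denom]
    # Redistribute each output's relevance proportionally to contributions
    return [sum(z[j][k] * rel_out[k] / denom[k] for k in range(n_out))
            for j in range(n_in)]

# Two inputs feeding one output with zero bias: the larger input
# receives the larger share, and total relevance is (nearly) conserved
rel_in = lrp_linear([1.0, 2.0], [[1.0], [1.0]], [0.0], [3.0])
```

Applying this rule layer by layer from the output back to the input yields the pixel-wise relevance maps the paper visualizes.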