ECINN: Efficient Counterfactuals from Invertible Neural Networks.

Home
/
Papers
/
ECINN: Efficient Counterfactuals from Invertible Neural Networks.

Posted Content•

ECINN: Efficient Counterfactuals from Invertible Neural Networks.

Frederik Hvilshøj¹, Alexandros Iosifidis¹, Ira Assent¹•Institutions (1)

25 Mar 2021-arXiv: Learning-

TL;DR: In this paper, a method, ECINN, was proposed to generate counterfactual examples in the time of only two evaluations of the classifier, which was shown to outperform established methods that generate heatmaps.

read less

Abstract: Counterfactual examples identify how inputs can be altered to change the predicted class of a classifier, thus opening up the black-box nature of, e.g., deep neural networks. We propose a method, ECINN, that utilizes the generative capacities of invertible neural networks for image classification to generate counterfactual examples efficiently. In contrast to competing methods that sometimes need a thousand evaluations or more of the classifier, ECINN has a closed-form expression and generates a counterfactual in the time of only two evaluations. Arguably, the main challenge of generating counterfactual examples is to alter only input features that affect the predicted outcome, i.e., class-dependent features. Our experiments demonstrate how ECINN alters class-dependent image regions to change the perceptual and predicted class of the counterfactuals. Additionally, we extend ECINN to also produce heatmaps (ECINNh) for easy inspection of, e.g., pairwise class-dependent changes in the generated counterfactual examples. Experimentally, we find that ECINNh outperforms established methods that generate heatmap-based explanations.

...read moreread less

Citations

PDF

Open Access

More filters

Posted Content•

Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction

[...]

Shauli Ravfogel¹, Grusha Prasad², Tal Linzen³, Yoav Goldberg¹•Institutions (3)

Allen Institute for Artificial Intelligence¹, Johns Hopkins University², New York University³

14 May 2021-arXiv: Computation and Language

TL;DR: This article proposed AlterRep, an intervention-based method to address the question of whether BERT models use their representations of syntax in a manner that is consistent with the grammar of the language, by measuring the change in a model's word prediction behavior when these counterfactual representations are substituted for the original ones.

...read moreread less

Abstract: When language models process syntactically complex sentences, do they use their representations of syntax in a manner that is consistent with the grammar of the language? We propose AlterRep, an intervention-based method to address this question. For any linguistic feature of a given sentence, AlterRep generates counterfactual representations by altering how the feature is encoded, while leaving intact all other aspects of the original representation. By measuring the change in a model's word prediction behavior when these counterfactual representations are substituted for the original ones, we can draw conclusions about the causal effect of the linguistic feature in question on the model's behavior. We apply this method to study how BERT models of different sizes process relative clauses (RCs). We find that BERT variants use RC boundary information during word prediction in a manner that is consistent with the rules of English grammar; this RC boundary information generalizes to a considerable extent across different RC types, suggesting that BERT represents RCs as an abstract linguistic category.

...read moreread less

5 citations

Proceedings Article•

Counterfactual Interventions Reveal the Causal Effect of Relative Clause Representations on Agreement Prediction

[...]

Shauli Ravfogel¹, Grusha Prasad², Tal Linzen³, Yoav Goldberg¹•Institutions (3)

Allen Institute for Artificial Intelligence¹, Johns Hopkins University², New York University³

01 Nov 2021

TL;DR: This paper proposed AlterRep, which generates counterfactual representations by altering how the feature is encoded, while leaving in- tact all other aspects of the original representation. And they applied this method to study how BERT models of different sizes process relative clauses (RCs).

...read moreread less

Abstract: When language models process syntactically complex sentences, do they use their representations of syntax in a manner that is consistent with the grammar of the language? We propose AlterRep, an intervention-based method to address this question. For any linguistic feature of a given sentence, AlterRep generates counterfactual representations by altering how the feature is encoded, while leaving in- tact all other aspects of the original representation. By measuring the change in a model’s word prediction behavior when these counterfactual representations are substituted for the original ones, we can draw conclusions about the causal effect of the linguistic feature in question on the model’s behavior. We apply this method to study how BERT models of different sizes process relative clauses (RCs). We find that BERT variants use RC boundary information during word prediction in a manner that is consistent with the rules of English grammar; this RC boundary information generalizes to a considerable extent across different RC types, suggesting that BERT represents RCs as an abstract linguistic category.

...read moreread less

Posted Content•

On Quantitative Evaluations of Counterfactuals.

[...]

Frederik Hvilshøj¹, Alexandros Iosifidis¹, Ira Assent¹•Institutions (1)

Aarhus University¹

30 Oct 2021-arXiv: Learning

TL;DR: In this article, the authors consolidate the work on evaluating visual counterfactual examples through an analysis and experiments, and propose two new metrics, the Label Variation Score and the Oracle score, which are both less vulnerable to such tiny changes.

...read moreread less

Abstract: As counterfactual examples become increasingly popular for explaining decisions of deep learning models, it is essential to understand what properties quantitative evaluation metrics do capture and equally important what they do not capture. Currently, such understanding is lacking, potentially slowing down scientific progress. In this paper, we consolidate the work on evaluating visual counterfactual examples through an analysis and experiments. We find that while most metrics behave as intended for sufficiently simple datasets, some fail to tell the difference between good and bad counterfactuals when the complexity increases. We observe experimentally that metrics give good scores to tiny adversarial-like changes, wrongly identifying such changes as superior counterfactual examples. To mitigate this issue, we propose two new metrics, the Label Variation Score and the Oracle score, which are both less vulnerable to such tiny changes. We conclude that a proper quantitative evaluation of visual counterfactual examples should combine metrics to ensure that all aspects of good counterfactuals are quantified.

...read moreread less

References

PDF

Open Access

More filters

Posted Content•

Adam: A Method for Stochastic Optimization

[...]

Diederik P. Kingma¹, Jimmy Ba²•Institutions (2)

University of Amsterdam¹, University of Toronto²

22 Dec 2014-arXiv: Learning

TL;DR: In this article, the adaptive estimates of lower-order moments are used for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimate of lowerorder moments.

...read moreread less

Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.

...read moreread less

23,486 citations

Proceedings Article•DOI•

Rethinking the Inception Architecture for Computer Vision

[...]

Christian Szegedy¹, Vincent Vanhoucke¹, Sergey Ioffe¹, Jonathon Shlens¹, Zbigniew Wojna² - Show less +1 more•Institutions (2)

Google¹, University College London²

27 Jun 2016

TL;DR: In this article, the authors explore ways to scale up networks in ways that aim at utilizing the added computation as efficiently as possible by suitably factorized convolutions and aggressive regularization.

...read moreread less

Abstract: Convolutional networks are at the core of most state of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we are exploring ways to scale up networks in ways that aim at utilizing the added computation as efficiently as possible by suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set demonstrate substantial gains over the state of the art: 21:2% top-1 and 5:6% top-5 error for single frame evaluation using a network with a computational cost of 5 billion multiply-adds per inference and with using less than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3:5% top-5 error and 17:3% top-1 error on the validation set and 3:6% top-5 error on the official test set.

...read moreread less

16,962 citations

Book Chapter•DOI•

Visualizing and Understanding Convolutional Networks

[...]

Matthew D. Zeiler¹, Rob Fergus¹•Institutions (1)

New York University¹

06 Sep 2014

TL;DR: A novel visualization technique is introduced that gives insight into the function of intermediate feature layers and the operation of the classifier in large Convolutional Network models, used in a diagnostic role to find model architectures that outperform Krizhevsky et al on the ImageNet classification benchmark.

...read moreread less

Abstract: Large Convolutional Network models have recently demonstrated impressive classification performance on the ImageNet benchmark Krizhevsky et al. [18]. However there is no clear understanding of why they perform so well, or how they might be improved. In this paper we explore both issues. We introduce a novel visualization technique that gives insight into the function of intermediate feature layers and the operation of the classifier. Used in a diagnostic role, these visualizations allow us to find model architectures that outperform Krizhevsky et al on the ImageNet classification benchmark. We also perform an ablation study to discover the performance contribution from different model layers. We show our ImageNet model generalizes well to other datasets: when the softmax classifier is retrained, it convincingly beats the current state-of-the-art results on Caltech-101 and Caltech-256 datasets.

...read moreread less

12,783 citations

Proceedings Article•DOI•

"Why Should I Trust You?": Explaining the Predictions of Any Classifier

[...]

Marco Tulio Ribeiro¹, Sameer Singh¹, Carlos Guestrin¹•Institutions (1)

University of Washington¹

13 Aug 2016

TL;DR: In this article, the authors propose LIME, a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem.

...read moreread less

Abstract: Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally varound the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.

...read moreread less

11,104 citations

Proceedings Article•

A unified approach to interpreting model predictions

[...]

Scott M. Lundberg¹, Su-In Lee¹•Institutions (1)

University of Washington¹

04 Dec 2017

TL;DR: In this article, a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations), is presented, which assigns each feature an importance value for a particular prediction.

...read moreread less

Abstract: Understanding why a model makes a certain prediction can be as crucial as the prediction's accuracy in many applications. However, the highest accuracy for large modern datasets is often achieved by complex models that even experts struggle to interpret, such as ensemble or deep learning models, creating a tension between accuracy and interpretability. In response, various methods have recently been proposed to help users interpret the predictions of complex models, but it is often unclear how these methods are related and when one method is preferable over another. To address this problem, we present a unified framework for interpreting predictions, SHAP (SHapley Additive exPlanations). SHAP assigns each feature an importance value for a particular prediction. Its novel components include: (1) the identification of a new class of additive feature importance measures, and (2) theoretical results showing there is a unique solution in this class with a set of desirable properties. The new class unifies six existing methods, notable because several recent methods in the class lack the proposed desirable properties. Based on insights from this unification, we present new methods that show improved computational performance and/or better consistency with human intuition than previous approaches.

...read moreread less

7,309 citations