Open Access · Posted Content

Explanations can be manipulated and geometry is to blame

TLDR
Explanations can be manipulated arbitrarily by applying barely perceptible perturbations to the input that keep the network's output approximately constant; theoretically, this phenomenon can be traced to certain geometric properties of neural networks.
Abstract
Explanation methods aim to make neural networks more trustworthy and interpretable. In this paper, we demonstrate a property of explanation methods which is disconcerting for both of these purposes. Namely, we show that explanations can be manipulated arbitrarily by applying visually barely perceptible perturbations to the input that keep the network's output approximately constant. We establish theoretically that this phenomenon can be related to certain geometrical properties of neural networks. This allows us to derive an upper bound on the susceptibility of explanations to manipulations. Based on this result, we propose effective mechanisms to enhance the robustness of explanations.
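The core attack is easy to state for gradient (saliency) explanations: optimize a perturbed input so that its explanation matches an arbitrary target map while the network's output stays approximately fixed. The following is a minimal sketch of that idea, not the authors' released code; the PyTorch `model`, the inputs `x` and `x_target`, and all hyperparameters are illustrative assumptions.

```python
import torch

def saliency(model, x):
    """Gradient explanation: d(top-class score)/d(input)."""
    if not x.requires_grad:
        x.requires_grad_(True)
    score = model(x).max(dim=1).values.sum()
    # create_graph=True keeps the map itself differentiable, so we can
    # later optimize the input against it
    (grad,) = torch.autograd.grad(score, x, create_graph=True)
    return grad

def manipulate(model, x, x_target, steps=200, lr=1e-3, gamma=1e2):
    """Perturb x so its explanation mimics that of x_target while the
    network's output stays approximately constant."""
    h_target = saliency(model, x_target).detach()
    out_orig = model(x).detach()
    x_adv = x.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([x_adv], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        # match the target explanation; penalize any output change
        loss = ((saliency(model, x_adv) - h_target) ** 2).sum() \
               + gamma * ((model(x_adv) - out_orig) ** 2).sum()
        loss.backward()
        opt.step()
    return x_adv.detach()
```

The geometric quantity appearing in the paper's bound is essentially the curvature of the network's output manifold, which motivates defenses that smooth the nonlinearities; among other mechanisms, the paper proposes replacing ReLU activations with softplus.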


Citations
Journal Article

A Survey on Explainable Artificial Intelligence (XAI): Toward Medical XAI

TL;DR: Reviews and categorizes the notions of interpretability proposed across different research works, in the hope that interpretability will receive more consideration in medical practice, and encourages initiatives to push forward data-based, mathematically grounded, and technically grounded medical education.
Proceedings Article

Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods

TL;DR: It is demonstrated how extremely biased (racist) classifiers crafted by the proposed framework can easily fool popular explanation techniques such as LIME and SHAP into generating innocuous explanations which do not reflect the underlying biases.
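The attack summarized above exploits the fact that LIME and SHAP probe a model mostly on perturbed, off-manifold samples. A hypothetical sketch of such a "scaffolded" classifier, where `is_ood`, `biased_model`, and `innocuous_model` are assumed stand-ins for an attacker-trained out-of-distribution detector and the two underlying models:

```python
import numpy as np

def scaffolded_predict(X, biased_model, innocuous_model, is_ood):
    """Route real-looking inputs to the biased model and
    explainer-generated (off-manifold) queries to an innocuous one."""
    X = np.asarray(X)
    off_manifold = np.array([is_ood(x) for x in X])
    # LIME/SHAP see mostly off-manifold samples, so their explanations
    # reflect only the innocuous model's behavior
    return np.where(off_manifold, innocuous_model(X), biased_model(X))
```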
Journal Article

A Unifying Review of Deep and Shallow Anomaly Detection

TL;DR: This review aims to identify the common underlying principles and the assumptions often made implicitly by various anomaly-detection methods, draws connections between classic "shallow" and novel deep approaches, and shows how this relation might cross-fertilize or extend both directions.
Proceedings Article

Explainable machine learning in deployment

TL;DR: In this paper, the authors explore how organizations view and use explainability for stakeholder consumption, and find that the majority of current deployments serve not the end users affected by the model but rather machine learning engineers, who use explainability to debug the model itself.
Posted Content

Explainable Machine Learning in Deployment

TL;DR: This study explores how organizations view and use explainability for stakeholder consumption, and synthesizes the limitations of current explainability techniques that hamper their use for end users.
References
Proceedings Article

Deep Residual Learning for Image Recognition

TL;DR: In this paper, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place in the ILSVRC 2015 classification task.
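For context, the residual idea is that stacked layers learn a correction F(x) that is added back through an identity shortcut, so very deep networks only need to fit residuals. A minimal sketch of one such block; channel counts and layer choices are illustrative, not the exact published configuration:

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # identity shortcut: add input back
```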
Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

TL;DR: Describes a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, which achieved state-of-the-art classification performance on ImageNet.
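A rough skeleton of the architecture as summarized above; filter counts follow the published configuration, but this is an illustrative sketch rather than a faithful reimplementation (it omits, e.g., the original local response normalization and two-GPU split):

```python
import torch.nn as nn

# Expects 227x227 RGB inputs; softmax is applied by the loss function.
alexnet_like = nn.Sequential(
    nn.Conv2d(3, 96, 11, stride=4), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(96, 256, 5, padding=2), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Conv2d(256, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, 3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, 3, padding=1), nn.ReLU(), nn.MaxPool2d(3, 2),
    nn.Flatten(),
    nn.Linear(256 * 6 * 6, 4096), nn.ReLU(), nn.Dropout(),
    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(),
    nn.Linear(4096, 1000),
)
```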
Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Journal Article

ImageNet Large Scale Visual Recognition Challenge

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images; it has been run annually since 2010, attracting participation from more than fifty institutions.
Dissertation

Learning Multiple Layers of Features from Tiny Images

TL;DR: Describes how to train a multi-layer generative model of natural images using a dataset of millions of tiny colour images.