Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs

Open Access, Posted Content

Ruigang Fu, +5 more

- 05 Aug 2020 -

arXiv: Computer Vision and Pattern Recog...

TLDR

This paper introduces two axioms -- Conservation and Sensitivity -- to the visualization paradigm of CAM methods and proposes a dedicated Axiom-based Grad-CAM (XGrad-CAM) that achieves better visualization performance than Grad-CAM while remaining class-discriminative and easy to implement compared with Grad-CAM++ and Ablation-CAM.

Abstract:

To better understand and use Convolutional Neural Networks (CNNs), the visualization and interpretation of CNNs have attracted increasing attention in recent years. In particular, several Class Activation Mapping (CAM) methods have been proposed to discover the connection between a CNN's decision and image regions. Although these methods produce reasonable visualizations, their main limitation is the lack of clear and sufficient theoretical support. In this paper, we introduce two axioms -- Conservation and Sensitivity -- to the visualization paradigm of the CAM methods. Meanwhile, a dedicated Axiom-based Grad-CAM (XGrad-CAM) is proposed to satisfy these axioms as much as possible. Experiments demonstrate that XGrad-CAM is an enhanced version of Grad-CAM in terms of conservation and sensitivity. It achieves better visualization performance than Grad-CAM, while also being class-discriminative and easy to implement compared with Grad-CAM++ and Ablation-CAM. The code is available at this https URL.
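The abstract does not spell out the weighting rule, so the following is only a minimal NumPy sketch under stated assumptions. It assumes the commonly described XGrad-CAM weighting, in which each gradient is scaled by its activation's share of the channel total, and contrasts it with Grad-CAM's plain gradient average. The toy linear scoring head (`head`), the array shapes, and all function names are hypothetical and chosen for illustration; none of this is taken from the authors' released code.

```python
import numpy as np

def grad_cam_weights(grads):
    """Grad-CAM-style weights: average each channel's gradients over space."""
    return grads.mean(axis=(1, 2))

def xgrad_cam_weights(acts, grads, eps=1e-7):
    """XGrad-CAM-style weights (assumed form): gradients weighted by each
    activation's share of its channel total, then summed over space."""
    totals = acts.sum(axis=(1, 2), keepdims=True)
    return (grads * acts / (totals + eps)).sum(axis=(1, 2))

def cam_map(acts, weights, relu=True):
    """Weighted sum of feature maps: (K,) x (K, H, W) -> (H, W)."""
    m = np.tensordot(weights, acts, axes=1)
    return np.maximum(m, 0) if relu else m

# Toy setup (hypothetical): K=4 non-negative feature maps, as after a ReLU,
# and a class score that is linear in the features with no bias term.
rng = np.random.default_rng(0)
acts = rng.random((4, 5, 5))
head = rng.normal(size=(4, 5, 5))
score = float((head * acts).sum())
grads = head  # d(score)/d(acts) for this linear head

# Maps before the final ReLU, so the sums can be compared with the score.
xmap = cam_map(acts, xgrad_cam_weights(acts, grads), relu=False)
gmap = cam_map(acts, grad_cam_weights(grads), relu=False)

print("class score      :", score)
print("sum of XGrad map :", float(xmap.sum()))
print("sum of Grad map  :", float(gmap.sum()))
```

In this bias-free toy case the XGrad-CAM-style map sums to the class score, which is one way to read the Conservation axiom; the Grad-CAM-style map generally does not. Real networks with bias terms would satisfy this only approximately, in line with the abstract's wording "as much as possible".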

Citations

Posted Content

Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation

Sam Sattarzadeh, +9 more

- 01 Oct 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work collects visualization maps from multiple layers of the model using an attribution-based input sampling technique, aggregates them into a fine-grained and complete explanation, and proposes a layer selection strategy that applies to the whole family of CNN-based models.

Journal Article

A Cascade Attention Based Facial Expression Recognition Network by Fusing Multi-Scale Spatio-Temporal Features

Xiaoliang Zhu, +4 more

- 01 Feb 2022 -

Sensors

TL;DR: This paper proposes a cascade attention-based facial expression recognition network on the basis of a combination of (i) local spatial feature, (ii) multi-scale-stereoscopic spatial context feature (extracted from the 3-scale pyramid feature), and (iii) temporal feature.

Journal Article

CNN-LRP: Understanding Convolutional Neural Networks Performance for Target Recognition in SAR Images

Bo Zang, +6 more

- 01 Jul 2021 -

Sensors

TL;DR: A novel LRP algorithm particularly designed for understanding CNN’s performance on SAR image target recognition is proposed, providing a concise form of the correlation between output of a layer and weights of the next layer in CNNs.

Posted Content

Deep Active Learning for Joint Classification & Segmentation with Weak Annotator

Soufiane Belharbi, +3 more

- 10 Oct 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: The results indicate that, by simply using random sample selection, the proposed approach can significantly outperform state-of-the-art CAMs and AL methods, with an identical oracle-supervision budget.

Journal Article

Measuring Visual Walkability Perception Using Panoramic Street View Images, Virtual Reality, and Deep Learning

Yunqin Li, +2 more

- 01 Aug 2022 -

Social Science Research Network

References

Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

TL;DR: This work trains a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

Proceedings Article

Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan, +1 more

TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.

Journal Article

ImageNet Large Scale Visual Recognition Challenge

Olga Russakovsky, +11 more

- 01 Dec 2015 -

International Journal of Computer Vision

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.

Proceedings Article

Fully convolutional networks for semantic segmentation

Jonathan Long, +2 more

TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.
