Open Access, Posted Content
Axiom-based Grad-CAM: Towards Accurate Visualization and Explanation of CNNs
TLDR
This paper introduces two axioms -- Conservation and Sensitivity -- to the visualization paradigm of CAM methods and proposes a dedicated Axiom-based Grad-CAM (XGrad-CAM) that achieves better visualization performance than Grad-CAM while remaining class-discriminative and easy to implement compared with Grad-CAM++ and Ablation-CAM.
Abstract:
To better understand and use Convolutional Neural Networks (CNNs), the visualization and interpretation of CNNs have attracted increasing attention in recent years. In particular, several Class Activation Mapping (CAM) methods have been proposed to discover the connection between a CNN's decision and image regions. Although these methods produce reasonable visualizations, their main limitation is the lack of clear and sufficient theoretical support. In this paper, we introduce two axioms -- Conservation and Sensitivity -- to the visualization paradigm of the CAM methods. Meanwhile, a dedicated Axiom-based Grad-CAM (XGrad-CAM) is proposed to satisfy these axioms as much as possible. Experiments demonstrate that XGrad-CAM is an enhanced version of Grad-CAM in terms of conservation and sensitivity. It achieves better visualization performance than Grad-CAM, while also being class-discriminative and easy to implement compared with Grad-CAM++ and Ablation-CAM. The code is available at this https URL.
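The abstract does not state the weighting formulas, so the following is only a minimal sketch under assumptions: it takes Grad-CAM weights to be the spatial mean of the gradients, and XGrad-CAM weights to be the gradients summed with each activation normalized by its map's total, which is how the method is commonly described. The function names (`grad_cam_weights`, `xgrad_cam_weights`, `cam_total`) and the toy numbers are hypothetical. The toy score is linear in the feature maps, a setting in which the Conservation idea (the explanation map sums to the class score) can be checked by hand.

```python
def grad_cam_weights(grads):
    """Assumed Grad-CAM rule: weight of map k is the spatial mean of its gradient."""
    return [sum(g) / len(g) for g in grads]


def xgrad_cam_weights(acts, grads, eps=1e-12):
    """Assumed XGrad-CAM rule: gradients weighted by activations normalized per map."""
    weights = []
    for a, g in zip(acts, grads):
        total = sum(a)  # spatial sum of feature map k
        weights.append(sum(ai * gi for ai, gi in zip(a, g)) / (total + eps))
    return weights


def cam_total(acts, weights):
    """Sum over all positions of the un-rectified map sum_k w_k * F_k."""
    return sum(w * sum(a) for a, w in zip(acts, weights))


# Toy data (hypothetical): two flattened 2x2 feature maps, non-negative as after ReLU.
acts = [[1.0, 2.0, 0.0, 1.0], [0.5, 0.0, 3.0, 0.5]]
# Linear toy score S = sum_k sum_i grads[k][i] * acts[k][i],
# so the gradient of S with respect to acts is exactly grads.
grads = [[0.2, -0.1, 0.4, 0.3], [1.0, 0.5, -0.2, 0.0]]
score = sum(a * g for A, G in zip(acts, grads) for a, g in zip(A, G))

xgrad_total = cam_total(acts, xgrad_cam_weights(acts, grads))
grad_total = cam_total(acts, grad_cam_weights(grads))
print("score:", score)
print("XGrad-CAM map total:", xgrad_total)
print("Grad-CAM map total:", grad_total)
```

In this linear toy setting the XGrad-CAM map total matches the score up to floating-point error, while the Grad-CAM total generally does not. For real networks with biases and nonlinearities the match would only be approximate, consistent with the abstract's wording that the axioms are satisfied "as much as possible".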
Citations
Posted Content
Explaining Convolutional Neural Networks through Attribution-Based Input Sampling and Block-Wise Feature Aggregation
Sam Sattarzadeh, Mahesh Sudhakar, Anthony Lem, Shervin Mehryar, Konstantinos N. Plataniotis, Jongseong Jang, Hyunwoo Kim, Yeonjeong Jeong, Sangmin Lee, Kyunghoon Bae
TL;DR: This work collects visualization maps from multiple layers of the model using an attribution-based input sampling technique and aggregates them to reach a fine-grained and complete explanation. It also proposes a layer selection strategy that applies to the whole family of CNN-based models.
Journal Article
A Cascade Attention Based Facial Expression Recognition Network by Fusing Multi-Scale Spatio-Temporal Features
TL;DR: This paper proposes a cascade attention-based facial expression recognition network on the basis of a combination of (i) local spatial feature, (ii) multi-scale-stereoscopic spatial context feature (extracted from the 3-scale pyramid feature), and (iii) temporal feature.
Journal Article
CNN-LRP: Understanding Convolutional Neural Networks Performance for Target Recognition in SAR Images
TL;DR: A novel LRP algorithm particularly designed for understanding CNN’s performance on SAR image target recognition is proposed, providing a concise form of the correlation between output of a layer and weights of the next layer in CNNs.
Posted Content
Deep Active Learning for Joint Classification & Segmentation with Weak Annotator
TL;DR: The results indicate that, by simply using random sample selection, the proposed approach can significantly outperform state-of-the-art CAMs and AL methods, with an identical oracle-supervision budget.
References
Proceedings Article
ImageNet Classification with Deep Convolutional Neural Networks
TL;DR: As discussed by the authors, state-of-the-art performance was achieved by a deep convolutional neural network consisting of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.
Proceedings Article
Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Journal Article
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael S. Bernstein, Alexander C. Berg, Li Fei-Fei
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images. It has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Proceedings Article
Fully convolutional networks for semantic segmentation
TL;DR: The key insight is to build “fully convolutional” networks that take input of arbitrary size and produce correspondingly-sized output with efficient inference and learning.