Ablation-CAM: Visual Explanations for Deep Convolutional Network via Gradient-free Localization
TL;DR: Ablation-based Class Activation Mapping (Ablation-CAM) uses ablation analysis to determine the importance of individual feature map units with respect to a class, and uses those importances to produce a coarse localization map highlighting the image regions that matter for predicting the concept.
Abstract: In response to recent criticism of gradient-based visualization techniques, we propose a new methodology to generate visual explanations for deep Convolutional Neural Network (CNN)-based models. Our approach, Ablation-based Class Activation Mapping (Ablation-CAM), uses ablation analysis to determine the importance (weights) of individual feature map units with respect to a class. These weights are then used to produce a coarse localization map highlighting the important regions in the image for predicting the concept. Our objective and subjective evaluations show that this gradient-free approach works better than the state-of-the-art Grad-CAM technique. Further experiments show that Ablation-CAM is class discriminative and can be used to evaluate trust in a model.
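The ablation procedure described in the abstract can be illustrated with a minimal NumPy sketch. It assumes that the weight of a feature map is the relative drop in class score when that map is zeroed out, and that the localization map is a ReLU of the weighted sum of maps. The names `ablation_cam`, `score_fn`, `toy_score`, and `class_w` are hypothetical, and the linear toy scorer merely stands in for the part of a real CNN above the chosen convolutional layer.

```python
import numpy as np

def ablation_cam(activations, score_fn, eps=1e-8):
    """Return (weights, cam) for feature maps of shape (K, H, W).

    score_fn is a hypothetical callable mapping a (K, H, W) array to the
    scalar score of the class being explained.
    """
    base = float(score_fn(activations))        # score with all units intact
    num_maps = activations.shape[0]
    weights = np.zeros(num_maps)
    for k in range(num_maps):
        ablated = activations.copy()
        ablated[k] = 0.0                       # ablate feature map k
        drop = base - float(score_fn(ablated))
        weights[k] = drop / (base + eps)       # assumed: relative score drop
    # Weighted sum of the maps, then ReLU to keep positive evidence only.
    cam = np.maximum(np.tensordot(weights, activations, axes=1), 0.0)
    return weights, cam

# Toy stand-in for a network head: global average pooling + linear layer.
rng = np.random.default_rng(0)
acts = rng.random((4, 3, 3)) + 0.5             # 4 feature maps of size 3x3
class_w = np.array([1.0, 0.5, 0.0, -0.25])     # hypothetical class weights

def toy_score(a):
    return float(class_w @ a.mean(axis=(1, 2)))

weights, cam = ablation_cam(acts, toy_score)
print(weights)      # one importance value per feature map
print(cam.shape)    # coarse map at feature-map resolution
```

In this sketch, a map the scorer ignores gets a weight of zero and a map that lowers the score gets a negative weight. The loop calls `score_fn` once per feature map plus once for the baseline, so the cost grows with the number of maps in the chosen layer.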
Cites background or methods from "Ablation-CAM: Visual Explanations f..."
...Besides, they also break the axiom of implementation invariance since they are layer-sensitive....
..., Grad-CAM, Grad-CAM++ and Ablation-CAM)....
... proposed Ablation-CAM to remove the dependence on gradients, but this method is quite time-consuming since it has to run forward propagation hundreds of times per image....
...Note that the original weight of each feature map in Ablation-CAM is defined as $\frac{S_c(F) - S_c(F \setminus F_k^l)}{\lVert F_k^l \rVert}$....
...This definition is inspired by CAM and further improved by other works, such as Grad-CAM++ and Ablation-CAM....
Cites background from "Ablation-CAM: Visual Explanations f..."
...They can be divided into two branches, one is gradient-based CAMs, which represent the linear weights corresponding to internal activation maps by gradient information....
...As the output layer is a non-linear function, gradient-based CAMs tend to diminish the backpropagating gradients, which causes gradient saturation and makes it difficult to provide concrete explanations....
...These categories are known as Class Activation Maps (CAMs)....
...The other is gradient-free CAMs, which capture the importance of each activation map by the target score in forward propagation....
...The generalisation of CAMs takes place with Grad-CAM....