Author

Jie Ren

Bio: Jie Ren is an academic researcher from Shanghai Jiao Tong University. The author has contributed to research in topics: Computer science & Convolutional neural network. The author has an h-index of 5 and has co-authored 15 publications receiving 81 citations.

Papers
Proceedings ArticleDOI
01 Oct 2019
TL;DR: This study proposes to distill knowledge from the CNN into an explainable additive model, which explains the CNN prediction quantitatively, and discusses the problem of the biased interpretation of CNN predictions.
Abstract: This paper presents a method to pursue a semantic and quantitative explanation for the knowledge encoded in a convolutional neural network (CNN). Estimating the specific rationale of each prediction made by the CNN is a key issue in understanding neural networks, and it is of significant value in real applications. In this study, we propose to distill knowledge from the CNN into an explainable additive model, which explains the CNN prediction quantitatively. We discuss the problem of the biased interpretation of CNN predictions. To overcome the biased interpretation, we develop prior losses to guide the learning of the explainable additive model. Experimental results have demonstrated the effectiveness of our method.

47 citations
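A minimal sketch of the distillation idea described above, assuming a PyTorch setting. The class and helper names (`AdditiveExplainer`, `distill_step`), the per-component linear heads, and the MSE-based prior term are illustrative assumptions, not the authors' implementation:

```python
import torch
import torch.nn as nn

class AdditiveExplainer(nn.Module):
    """Additive surrogate: explained logit ~ sum_i g_i(feat_i) + bias."""
    def __init__(self, feat_dims):
        super().__init__()
        # one tiny head per feature component; its scalar output is that
        # component's additive contribution to the explained logit
        self.heads = nn.ModuleList(nn.Linear(d, 1) for d in feat_dims)
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, feats):                  # feats: list of (B, d_i) tensors
        contribs = torch.cat([h(f) for h, f in zip(self.heads, feats)], dim=1)
        return contribs.sum(dim=1, keepdim=True) + self.bias, contribs

def distill_step(explainer, feats, cnn_logit, prior, optimizer, prior_w=0.1):
    """Match the CNN logit; pull per-component contributions toward a prior
    attribution (a hedged stand-in for the paper's prior losses)."""
    pred, contribs = explainer(feats)
    loss = (nn.functional.mse_loss(pred, cnn_logit)
            + prior_w * nn.functional.mse_loss(contribs, prior))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The additive structure is what makes the surrogate quantitative: each head's scalar output can be read off directly as that component's contribution to the explained prediction.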

Posted Content
TL;DR: It is proved that some classic methods of enhancing the transferability essentially decrease interactions inside adversarial perturbations, and directly penalizing interactions during the attacking process is proposed, which significantly improves the adversarial transferability.
Abstract: In this paper, we use the interaction inside adversarial perturbations to explain and boost the adversarial transferability. We discover and prove the negative correlation between the adversarial transferability and the interaction inside adversarial perturbations. The negative correlation is further verified through different DNNs with various inputs. Moreover, this negative correlation can be regarded as a unified perspective to understand current transferability-boosting methods. To this end, we prove that some classic methods of enhancing the transferability essentially decrease interactions inside adversarial perturbations. Based on this, we propose to directly penalize interactions during the attacking process, which significantly improves the adversarial transferability.

33 citations
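A simplified sketch of penalizing interactions during the attack, under heavy assumptions: the perturbation is split into a coarse grid of units, the pairwise interaction of two units is estimated with the standard inclusion-exclusion difference of logits, and a Monte-Carlo sample of pairs is penalized inside a PGD loop. The grid size, number of sampled pairs, and loss weight `lam` are illustrative choices, not the paper's:

```python
import torch

def pairwise_interaction(model, x, delta, mask_i, mask_j, y):
    """Inclusion-exclusion estimate of the interaction between two
    perturbation units: f(d) - f(d w/o i) - f(d w/o j) + f(d w/o i,j)."""
    def logit(d):
        return model(x + d).gather(1, y[:, None]).squeeze(1)
    return (logit(delta)
            - logit(delta * (1 - mask_i))
            - logit(delta * (1 - mask_j))
            + logit(delta * (1 - mask_i) * (1 - mask_j)))

def interaction_reduced_pgd(model, x, y, eps=8/255, alpha=2/255,
                            steps=10, lam=1.0, n_pairs=8, grid=4):
    delta = torch.zeros_like(x, requires_grad=True)
    cell = x.shape[-1] // grid                       # square grid of units
    for _ in range(steps):
        loss = torch.nn.functional.cross_entropy(model(x + delta), y)
        penalty = 0.0
        for _ in range(n_pairs):                     # Monte-Carlo pair sample
            i, j = torch.randperm(grid * grid)[:2].tolist()
            mi, mj = torch.zeros_like(x), torch.zeros_like(x)
            mi[..., (i // grid) * cell:(i // grid + 1) * cell,
                    (i % grid) * cell:(i % grid + 1) * cell] = 1
            mj[..., (j // grid) * cell:(j // grid + 1) * cell,
                    (j % grid) * cell:(j % grid + 1) * cell] = 1
            penalty = penalty + pairwise_interaction(
                model, x, delta, mi, mj, y).abs().mean()
        # ascend the classification loss while pushing interactions down
        total = loss - lam * penalty / n_pairs
        grad = torch.autograd.grad(total, delta)[0]
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (x + delta).detach().clamp(0, 1)
```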

Proceedings Article
28 Sep 2020
TL;DR: In this article, the negative correlation between the adversarial transferability and the interaction inside adversarial perturbations was discovered and verified across different DNNs with various inputs; this correlation can be regarded as a unified perspective for understanding current transferability-boosting methods.
Abstract: In this paper, we use the interaction inside adversarial perturbations to explain and boost the adversarial transferability. We discover and prove the negative correlation between the adversarial transferability and the interaction inside adversarial perturbations. The negative correlation is further verified through different DNNs with various inputs. Moreover, this negative correlation can be regarded as a unified perspective to understand current transferability-boosting methods. To this end, we prove that some classic methods of enhancing the transferability essentially decrease interactions inside adversarial perturbations. Based on this, we propose to directly penalize interactions during the attacking process, which significantly improves the adversarial transferability.

19 citations

Posted Content
TL;DR: A generic method is proposed to revise the neural network so that inferring input attributes from features becomes much harder while outputs remain highly accurate; it significantly diminishes the adversary's ability to infer the input while largely preserving the resulting accuracy.
Abstract: Previous studies have found that an adversary attacker can often infer unintended input information from intermediate-layer features. We study the possibility of preventing such adversarial inference, yet without too much accuracy degradation. We propose a generic method to revise the neural network to boost the challenge of inferring input attributes from features, while maintaining highly accurate outputs. In particular, the method transforms real-valued features into complex-valued ones, in which the input is hidden in a randomized phase of the transformed features. The knowledge of the phase acts like a key, with which any party can easily recover the output from the processing result, but without which the party can neither recover the output nor distinguish the original input. Preliminary experiments on various datasets and network structures have shown that our method significantly diminishes the adversary's ability to infer the input while largely preserving the resulting accuracy.

16 citations
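A hedged sketch of the core phase-hiding idea: pair two real feature tensors into a complex one and rotate it by a random per-sample phase that acts as the key. This is only the gist; the paper's actual network revision is more involved:

```python
import math
import torch

def encode(feat_a, feat_b):
    """Pair two real feature tensors into complex features rotated by a
    random per-sample phase; the phase is the secret key."""
    shape = (feat_a.shape[0],) + (1,) * (feat_a.dim() - 1)
    theta = torch.rand(shape) * 2 * math.pi
    return torch.complex(feat_a, feat_b) * torch.exp(1j * theta), theta

def decode(z, theta):
    """Undo the rotation with the key and recover the real feature pair."""
    z = z * torch.exp(-1j * theta)
    return z.real, z.imag

# usage:
# a, b = torch.randn(4, 64), torch.randn(4, 64)
# z, key = encode(a, b)
# a2, b2 = decode(z, key)   # close to (a, b) up to float error
```

Without `theta`, the encoded features look phase-randomized; with it, the downstream computation can continue as if nothing had been hidden.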

Posted Content
TL;DR: This paper proposes a generic definition for the feature complexity of a DNN, and designs a set of metrics to evaluate the reliability, the effectiveness, and the significance of over-fitting of these feature components.
Abstract: This paper aims to define, quantify, and analyze the feature complexity that is learned by a DNN. We propose a generic definition for the feature complexity. Given the feature of a certain layer in the DNN, our method disentangles feature components of different complexity orders from the feature. We further design a set of metrics to evaluate the reliability, the effectiveness, and the significance of over-fitting of these feature components. Furthermore, we successfully discover a close relationship between the feature complexity and the performance of DNNs. As a generic mathematical tool, the feature complexity and the proposed metrics can also be used to analyze the success of network compression and knowledge distillation.

13 citations
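One plausible reading of the decomposition, sketched under strong assumptions (the paper's exact construction may differ): approximate the target feature with disentangler networks of increasing depth, and take the order-k component to be what a network with k nonlinear layers captures beyond a (k-1)-layer one:

```python
import torch
import torch.nn as nn

def disentangler(in_dim, out_dim, k, width=128):
    """A network with exactly k nonlinear layers."""
    layers, d = [], in_dim
    for _ in range(k):
        layers += [nn.Linear(d, width), nn.ReLU()]
        d = width
    return nn.Sequential(*layers, nn.Linear(d, out_dim))

def complexity_components(x, target_feat, max_order=4, epochs=200):
    comps, prev = [], torch.zeros_like(target_feat)
    for k in range(1, max_order + 1):
        net = disentangler(x.shape[1], target_feat.shape[1], k)
        opt = torch.optim.Adam(net.parameters(), lr=1e-3)
        for _ in range(epochs):                # fit the feature with depth k
            opt.zero_grad()
            nn.functional.mse_loss(net(x), target_feat).backward()
            opt.step()
        approx = net(x).detach()
        comps.append(approx - prev)            # signal new at order k
        prev = approx
    return comps
```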


Cited by
Posted Content
TL;DR: This paper proposes a simple yet effective Adversarial Weight Perturbation (AWP) to explicitly regularize the flatness of weight loss landscape, forming a double-perturbation mechanism in the adversarial training framework that adversarially perturbs both inputs and weights.
Abstract: The study of improving the robustness of deep neural networks against adversarial examples has grown rapidly in recent years. Among these efforts, adversarial training is the most promising one, which flattens the input loss landscape (loss change with respect to input) via training on adversarially perturbed examples. However, how the widely used weight loss landscape (loss change with respect to weight) behaves in adversarial training is rarely explored. In this paper, we investigate the weight loss landscape from a new perspective, and identify a clear correlation between the flatness of the weight loss landscape and the robust generalization gap. Several well-recognized adversarial training improvements, such as early stopping, designing new objective functions, or leveraging unlabeled data, all implicitly flatten the weight loss landscape. Based on these observations, we propose a simple yet effective Adversarial Weight Perturbation (AWP) to explicitly regularize the flatness of the weight loss landscape, forming a double-perturbation mechanism in the adversarial training framework that adversarially perturbs both inputs and weights. Extensive experiments demonstrate that AWP indeed brings a flatter weight loss landscape and can be easily incorporated into various existing adversarial training methods to further boost their adversarial robustness.

265 citations
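A minimal sketch of the double-perturbation training step, with illustrative hyperparameters and a generic `attack` callable standing in for the inner input attack; consult the authors' released code for the real recipe:

```python
import torch
import torch.nn.functional as F

def awp_step(model, x, y, optimizer, attack, gamma=5e-3):
    """One adversarial-training step with a weight perturbation on top."""
    x_adv = attack(model, x, y)                  # inner input attack (e.g. PGD)

    # Ascend the loss in weight space within a small, layer-wise-scaled ball.
    loss = F.cross_entropy(model(x_adv), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    backup = []
    for p, g in zip(model.parameters(), grads):
        backup.append(p.detach().clone())
        p.data.add_(gamma * p.norm() * g / (g.norm() + 1e-12))

    # Descend on the doubly perturbed loss, then undo the weight perturbation
    # before applying the update to the original weights.
    optimizer.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    for p, w in zip(model.parameters(), backup):
        p.data.copy_(w)
    optimizer.step()
```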

Journal ArticleDOI
TL;DR: CA-Net, proposed in this paper, combines a joint spatial attention module that makes the network focus more on the foreground region with a novel channel attention module that adaptively recalibrates channel-wise feature responses and highlights the most relevant feature channels.
Abstract: Accurate medical image segmentation is essential for diagnosis and treatment planning of diseases. Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance for automatic medical image segmentation. However, they are still challenged by complicated conditions where the segmentation target has large variations of position, shape and scale, and existing CNNs have a poor explainability that limits their application to clinical decisions. In this work, we make extensive use of multiple attentions in a CNN architecture and propose a comprehensive attention-based CNN (CA-Net) for more accurate and explainable medical image segmentation that is aware of the most important spatial positions, channels and scales at the same time. In particular, we first propose a joint spatial attention module to make the network focus more on the foreground region. Then, a novel channel attention module is proposed to adaptively recalibrate channel-wise feature responses and highlight the most relevant feature channels. Also, we propose a scale attention module implicitly emphasizing the most salient feature maps among multiple scales so that the CNN is adaptive to the size of an object. Extensive experiments on skin lesion segmentation from ISIC 2018 and multi-class segmentation of fetal MRI found that our proposed CA-Net significantly improved the average segmentation Dice score from 87.77% to 92.08% for skin lesions, 84.79% to 87.08% for the placenta and 93.20% to 95.88% for the fetal brain respectively compared with U-Net. It reduced the model size by around 15 times with close or even better accuracy compared with the state-of-the-art DeepLabv3+. In addition, it has a much higher explainability than existing networks by visualizing the attention weight maps. Our code is available at https://github.com/HiLab-git/CA-Net .

205 citations
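For flavor, an SE-style channel attention block of the kind the abstract describes; CA-Net's actual joint spatial, channel, and scale attention modules are more elaborate (see the linked repository for the authors' implementation):

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel recalibration."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x):                  # x: (B, C, H, W)
        w = self.fc(x.mean(dim=(2, 3)))    # squeeze: global average pool
        return x * w[:, :, None, None]     # excite: reweight the channels
```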

Posted Content
01 Jan 1998
TL;DR: An axiomatization of the interaction between the players of any coalition is given based on three axioms: linearity, dummy and symmetry, which gives an expression of the Banzhaf and Shapley interaction indices in terms of pseudo-Boolean functions.
Abstract: An axiomatization of the interactions between the players of any coalition is given. It is based on three axioms: linearity, dummy and symmetry. These interaction indices extend the Banzhaf and Shapley values when two equivalent recursive axioms are additionally used. Lastly, the authors give an expression of the Banzhaf and Shapley interaction indices in terms of pseudo-Boolean functions.

179 citations
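For reference, the Shapley interaction index of a pair of players i, j in a game v on a player set N (with |N| = n), as characterized by this line of work, takes the form

```latex
I_{\mathrm{Sh}}(i,j) = \sum_{S \subseteq N \setminus \{i,j\}}
  \frac{|S|!\,(n-|S|-2)!}{(n-1)!}
  \left( v(S \cup \{i,j\}) - v(S \cup \{i\}) - v(S \cup \{j\}) + v(S) \right)
```

A positive value indicates that i and j cooperate synergistically; a negative value indicates redundancy between them.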

Journal ArticleDOI
TL;DR: This work makes extensive use of multiple attentions in a CNN architecture and proposes a comprehensive attention-based CNN (CA-Net) for more accurate and explainable medical image segmentation that is aware of the most important spatial positions, channels and scales at the same time.
Abstract: Accurate medical image segmentation is essential for diagnosis and treatment planning of diseases. Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance for automatic medical image segmentation. However, they are still challenged by complicated conditions where the segmentation target has large variations of position, shape and scale, and existing CNNs have a poor explainability that limits their application to clinical decisions. In this work, we make extensive use of multiple attentions in a CNN architecture and propose a comprehensive attention-based CNN (CA-Net) for more accurate and explainable medical image segmentation that is aware of the most important spatial positions, channels and scales at the same time. In particular, we first propose a joint spatial attention module to make the network focus more on the foreground region. Then, a novel channel attention module is proposed to adaptively recalibrate channel-wise feature responses and highlight the most relevant feature channels. Also, we propose a scale attention module implicitly emphasizing the most salient feature maps among multiple scales so that the CNN is adaptive to the size of an object. Extensive experiments on skin lesion segmentation from ISIC 2018 and multi-class segmentation of fetal MRI found that our proposed CA-Net significantly improved the average segmentation Dice score from 87.77% to 92.08% for skin lesions, 84.79% to 87.08% for the placenta and 93.20% to 95.88% for the fetal brain respectively compared with U-Net. It reduced the model size by around 15 times with close or even better accuracy compared with the state-of-the-art DeepLabv3+. In addition, it has a much higher explainability than existing networks by visualizing the attention weight maps. Our code is available at https://github.com/HiLab-git/CA-Net .

174 citations