Explaining and Harnessing Adversarial Examples
Citations
7,027 citations
Cites background from "Explaining and Harnessing Adversari..."
...Past work has observed that ReLU neural networks are locally almost linear (Goodfellow et al., 2015), which suggests that second derivatives may be close to zero in most cases, partially explaining the good perfor...
[...]
5,789 citations
Cites background or methods from "Explaining and Harnessing Adversari..."
...Due to the growing body of work on adversarial examples (Gu & Rigazio, 2014; Fawzi et al., 2015; Torkamani, 2016; Papernot et al., 2016; Carlini & Wagner, 2016a; Tramèr et al., 2017b; Goodfellow et al., 2014; Kurakin et al., 2016), we focus only on the most related papers here....
[...]
...Unfortunately, ERM often does not yield models that are robust to adversarially crafted examples (Goodfellow et al., 2014; Kurakin et al., 2016; Moosavi-Dezfooli et al., 2016; Tramèr et al., 2017b)....
[...]
...For instance, the ℓ∞-ball around x has recently been studied as a natural notion for adversarial perturbations (Goodfellow et al., 2014)....
[...]
...While trained models tend to be very effective in classifying benign inputs, recent work (Szegedy et al., 2013; Goodfellow et al., 2014; Nguyen et al., 2015; Sharif et al., 2016) shows that an adversary is often able to manipulate the input so that the model produces an incorrect output....
[...]
...On the attack side, prior work has proposed methods such as the Fast Gradient Sign Method (FGSM) and multiple variations of it (Goodfellow et al., 2014)....
[...]
5,782 citations
Additional excerpts
...15 Adversarial misclassification example [81]...
[...]
...[81] generate adversarial examples to improve performance on the MNIST classification task....
[...]
4,505 citations
3,114 citations
Cites background or methods from "Explaining and Harnessing Adversari..."
...Prior work on adversarial sample crafting against DNNs developed a simple technique corresponding to the Architecture and Training Tools threat model, based on gradients used for DNN training [17], [30], [35]....
[...]
...Furthermore, adding adversarial samples to the training set can act like a regularizer [17]....
[...]
...introduced a system that generates adversarial samples by perturbing inputs in a way that creates source/target misclassifications [17], [35]....
[...]
...Previous work explored DNN properties that could be used to craft adversarial samples [17], [30], [35]....
[...]
...most components used to build DNNs [17]....
[...]