A Three-Player GAN: Generating Hard Samples To Improve
Classification Networks
Simon Vandenhende, Bert De Brabandere, Davy Neven and Luc Van Gool
KU Leuven
ESAT-PSI, Belgium
firstname.lastname@esat.kuleuven.be
Abstract
We propose a Three-Player Generative Adversarial Network to improve classification networks. In addition to the game played between the discriminator and generator, a competition is introduced between the generator and the classifier. The generator's objective is to synthesize samples that are both realistic and hard to label for the classifier. Even though we make no assumptions on the type of augmentations to learn, we find that the model is able to synthesize realistic-looking examples that are hard for the classification model. Furthermore, the classifier becomes more robust when trained on these difficult samples. The method is evaluated on a public dataset for traffic sign recognition.
1 Introduction
Deep convolutional neural networks have brought significant progress to the area of computer vision. However, training the models still requires vast amounts of data. As intelligent vision systems are being deployed in increasingly dynamic environments, collecting the necessary data becomes a tedious task.
Recent work in generative modeling, based on Generative Adversarial Networks (GANs) [1-4], makes it possible to efficiently synthesize novel samples that belong to the data distribution. GANs derive the data distribution from an adversarial game played between two entities: the generator G synthesizes new samples, and the discriminator D tries to separate real samples from the ones synthesized by G. The goal of the generator is to confuse D so that it cannot discriminate between real and fake examples. The game ends when the two players reach a Nash equilibrium.
GANs have proven useful for improving the performance of classification networks. For example, [5] proposes an adversarial approach which jointly optimizes the data augmentation and a network for pose estimation. The generator learns to synthesize augmentations from the training data that are hard to label for the classification network. The augmentations are composed of rotations, scaling transformations and occlusions.
Furthermore, [6-9] have successfully employed GANs in a semi-supervised learning setting. In [6], the discriminator learns the classification task from unlabeled data: it has to classify each sample into a chosen number of categories. Since the conditional distribution p(c|x) is unknown, a goodness-of-fit measure is included to ensure correspondence between the categories and the class labels. [8] trains a classifier in a semi-supervised manner by considering images from the GAN as samples from an additional class. [9] trains the classifier and the generative model simultaneously. They find that both generator and classifier represent a conditional distribution between labels and images. This observation leads to a compatibility criterion between the generator and classifier.
Our work implements a three-player adversarial game in which the classification network participates. The generator adapts itself to both the discriminator and the classifier. This allows the generator to estimate the distribution of samples that are hard to label correctly for the classifier. In contrast to [5], our work does not restrict the type of augmentations that can be learned. Also, the proposed method relies solely on backpropagation, which makes it a very general approach. We show that the three-player game can improve classification networks when annotated data is scarce. The proposed method is evaluated on CURE-TSR [10], a publicly available dataset for traffic sign recognition.
2 Method
A regular Generative Adversarial Network [1] comprises a min-max game played between the discriminator D and the generator G. Additionally, we now introduce a competition between the generator and the classifier. The objective for G changes from synthesizing images that are realistic to generating images that are both realistic and challenging for the classification network.
As before, the discriminator is trained to predict whether a sample is real or fake. The generator, in turn, optimizes the sum of two losses. The first term is the regular GAN loss, provided by the discriminator. In order for the generator to compete with the classifier, the second loss term needs to be chosen appropriately. To this end, backpropagation should yield the maximization of the classification model's loss on samples from G. This encourages G to move towards the distribution of samples that confuse the classifier. The classifier is trained by minimizing the classification loss on samples from G. The game is played by updating all three models one after another.

Figure 1: Setup for the three-player game. Images from the generator are propagated through both the discriminator D and the classifier C. The gradient that is backpropagated through D proceeds as usual. The gradient that is backpropagated through C is rescaled and inverted as −λ ∇_{θ_C} L_C. The loss from D penalizes G for synthesizing unrealistic samples, while the inverted loss from C rewards G for synthesizing difficult samples.
Inspired by [11], the objective for the second loss term, as seen by G, is realized by implementing a gradient reversal layer between the generator and the classifier. During the forward pass, samples from G are simply passed to the classification network. When backpropagating the classification loss, the sign of the gradient is reversed, causing the update in G to maximize the classification loss. This technique is related to [12], which finds adversarial examples by applying perturbations that lie along directions where the classification loss is likely to increase. The setup of our system is shown in Figure 1.
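For illustration, such a layer can be implemented in a few lines of PyTorch. The snippet below is a minimal sketch with illustrative names (GradReverse, grad_reverse), not the exact implementation used in the experiments.

    # Minimal sketch of a gradient reversal layer (PyTorch assumed).
    # Identity in the forward pass; multiplies the gradient by -lambda in the backward pass.
    import torch

    class GradReverse(torch.autograd.Function):
        @staticmethod
        def forward(ctx, x, lam):
            ctx.lam = lam
            return x.view_as(x)

        @staticmethod
        def backward(ctx, grad_output):
            # Reverse and rescale the gradient flowing back towards the generator.
            return -ctx.lam * grad_output, None

    def grad_reverse(x, lam=1.0):
        return GradReverse.apply(x, lam)

With such a layer, the classifier sees the generator's samples unchanged, while the gradient that reaches the generator pushes it towards samples on which the classification loss increases.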
The three-player GAN shows some similarities with auxiliary classifier GANs (ACGANs) [13]. In the ACGAN model, the discriminator categorizes the images in addition to predicting their source, which allows the discriminator to be deployed as a classification model. There are two main differences with our approach. First, in the three-player game, the generator tries to maximize the classification loss rather than minimize it; the focus of this work is on the generation of hard samples. Secondly, the three-player GAN separates the network architectures of the discriminator and the classifier, which allows each architecture to be specialized for its respective task.
The complete training procedure for the three-player game is defined in Algorithm 1. A hyperparameter λ is introduced to weigh the classification loss against the discriminative loss.
Algorithm 1: The three-player GAN

for number of training iterations do
    Sample a batch (x_g, y_g) of size m from the generator, and a batch (x, y) of size m from the training data.
    Update the discriminator by ascending its stochastic gradient:

        ∇_{θ_d} [ (1/m) Σ_{(x,y)} log D(x, y) + (1/m) Σ_{(x_g,y_g)} log(1 − D(x_g, y_g)) ]

    Sample a batch (x_g, y_g) of size m from the generator.
    Update the generator by descending its stochastic gradient:

        ∇_{θ_g} [ (1/m) Σ_{(x_g,y_g)} log(1 − D(x_g, y_g)) − λ · (1/m) Σ_{(x_g,y_g)} L_C(C(x_g), y_g) ]

    Sample a batch (x, y) of size m.
    Update the classifier by descending its stochastic gradient:

        ∇_{θ_c} (1/m) Σ_{(x,y)} L_C(C(x), y)
end
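For concreteness, the following PyTorch-style sketch spells out one iteration of these three alternating updates. The model interfaces, the use of cross-entropy for L_C, and the batch handling are illustrative assumptions rather than the exact implementation; the explicit minus sign on the classification term plays the role of the gradient reversal layer.

    # One iteration of the three-player game (illustrative sketch, PyTorch assumed).
    # G(z, y): conditional generator, D(x, y): conditional discriminator in [0, 1],
    # C(x): classifier logits. Cross-entropy stands in for the classification loss L_C.
    import torch
    import torch.nn.functional as F

    def three_player_step(G, D, C, opt_d, opt_g, opt_c, real_iter,
                          lam, m, z_dim, n_classes, device, eps=1e-8):
        # --- Discriminator update: ascend log D(x, y) + log(1 - D(x_g, y_g)) ---
        x, y = (t.to(device) for t in next(real_iter))
        z = torch.randn(m, z_dim, device=device)
        y_g = torch.randint(0, n_classes, (m,), device=device)
        x_g = G(z, y_g).detach()
        loss_d = -(torch.log(D(x, y) + eps).mean()
                   + torch.log(1.0 - D(x_g, y_g) + eps).mean())
        opt_d.zero_grad(); loss_d.backward(); opt_d.step()

        # --- Generator update: descend log(1 - D(x_g, y_g)) - lam * L_C(C(x_g), y_g) ---
        # Descending this objective both fools D and *increases* the classification loss.
        z = torch.randn(m, z_dim, device=device)
        y_g = torch.randint(0, n_classes, (m,), device=device)
        x_g = G(z, y_g)
        loss_g = (torch.log(1.0 - D(x_g, y_g) + eps).mean()
                  - lam * F.cross_entropy(C(x_g), y_g))
        opt_g.zero_grad(); loss_g.backward(); opt_g.step()

        # --- Classifier update: descend L_C on a fresh batch ---
        # (Section 3.1 mixes real samples with samples from the initial and current generator.)
        x_c, y_c = (t.to(device) for t in next(real_iter))
        loss_c = F.cross_entropy(C(x_c), y_c)
        opt_c.zero_grad(); loss_c.backward(); opt_c.step()
        return loss_d.item(), loss_g.item(), loss_c.item()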
3 Experiments
We first consider a toy example, which demonstrates that the three-player game acts as a regularizer for the decision surface of the classifier. In the second part we evaluate our method on CURE-TSR [10]. Both experiments compare the performance of a classification network trained through the three-player game against several other training scenarios.
3.1 Training details
We initialize the discriminator and generator in the three-player game by training a conditional GAN. When updating the classifier, we sample batches containing real images, images synthesized by the initial generator, and images synthesized by the current generator. The samples from the initial generator serve to avoid catastrophic forgetting of examples that are difficult early on.

Figure 2: Decision surface of the classification model at the end of different training procedures: (a) trained on real samples; (b) trained on real and synthesized samples; (c) trained on real, synthesized and difficult samples.
The learning rates and the weighting parameter λ are updated according to the scheme from [11]:

    λ = w_c · ( 2 / (1 + exp(−10 · p)) − 1 ),    μ = μ_0 / (1 + α · p)^β

with p the training progress growing linearly from 0 to 1, α = 10, β = 0.75, w_c = 0.1 and μ_0 the initial learning rate. The value of w_c is chosen smaller than one to ensure that synthesizing realistic samples has priority over synthesizing difficult ones. The weighting parameter λ gradually grows during training, allowing the generator to come up with difficult samples even as the classification model improves.
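Written out as code, the schedules take the following form (a minimal sketch; function and variable names are illustrative, and the scaled logistic form for λ follows the formula above):

    # Sketch of the lambda and learning-rate schedules described above.
    import math

    def lambda_schedule(p, w_c=0.1):
        # Grows smoothly from 0 towards w_c as the training progress p goes from 0 to 1.
        return w_c * (2.0 / (1.0 + math.exp(-10.0 * p)) - 1.0)

    def lr_schedule(p, mu_0, alpha=10.0, beta=0.75):
        # Decays the initial learning rate mu_0 as training progresses.
        return mu_0 / (1.0 + alpha * p) ** beta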
3.2 Toy example
We demonstrate that the three-player GAN effectively acts as a regularizer by means of a toy example. Consider the case where samples from two classes need to be separated. Both classes are distributed as two-dimensional Gaussians, parameterized by μ_X = ±1, μ_Y = ±1 and σ_X = σ_Y = 0.5. The training data consists of eight examples per class, drawn as dots and crosses in Figure 2. The classifier, represented as a simple linear mapping, is trained using a hinge loss.
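This setup can be sketched in a few lines of PyTorch (assuming class means at (1, 1) and (−1, −1) and plain SGD; the snippet is illustrative rather than the exact code used for Figure 2):

    # Toy example sketch: two 2D Gaussian classes and a linear classifier with a hinge loss.
    import torch

    torch.manual_seed(0)
    n_per_class, sigma = 8, 0.5
    x_pos = torch.randn(n_per_class, 2) * sigma + torch.tensor([1.0, 1.0])    # class +1
    x_neg = torch.randn(n_per_class, 2) * sigma + torch.tensor([-1.0, -1.0])  # class -1
    x = torch.cat([x_pos, x_neg])
    y = torch.cat([torch.ones(n_per_class), -torch.ones(n_per_class)])

    w = torch.zeros(2, requires_grad=True)
    b = torch.zeros(1, requires_grad=True)
    opt = torch.optim.SGD([w, b], lr=0.1)
    for _ in range(200):
        margin = y * (x @ w + b)                          # signed margin of each sample
        loss = torch.clamp(1.0 - margin, min=0.0).mean()  # hinge loss
        opt.zero_grad(); loss.backward(); opt.step()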
A baseline classifier and a conditional GAN are trained using the available training examples. A second classification model is trained on a combination of real and synthesized samples. Thirdly, we also train a classification model based on the three-player game. For this particular example, we initialize the classifier in the game as the baseline model and freeze its parameters; the game is played for a few epochs, allowing the generator to estimate the distribution of samples that are difficult for the baseline model. The parameters of the classification models were initialized randomly, and to ensure a fair comparison all models start from the same initial weights. Figure 2 shows the decision boundaries of the classification models obtained with the different training schemes. Comparing them, we find that the three-player GAN is able to regularize the decision surface.
Consider again the two Gaussian distributions from before, but with an increased variance. When sampling from the two classes, we find that the distributions show a significant overlap near the origin. If the three-player game behaves as intended, we expect the generator to synthesize samples which lie near the origin. We train a classification model and a conditional GAN by sampling from the two Gaussian distributions. Afterwards, the generator is updated through the three-player game in order to synthesize difficult samples. Figure 3 shows the results: the generator learns to synthesize samples at locations where the classifier has a hard time.
3.3 CURE-TSR
The CURE-TSR dataset [10] is composed of both real and simulated images of 14 traffic sign classes under various weather conditions. The set of simulated traffic sign instances is considered here under the following conditions: clear weather, low, mid and high levels of snow, low, mid and high levels of rain, and low, mid and high levels of dark weather. For the training (resp. validation) set we took the first 100 (resp. last 50) images per class from each weather condition. Since the data contains sequences of images in which the camera gradually moves closer to the traffic sign, this data selection comes down to using only a few such sequences.
Again, we train a classifier and a conditional GAN using the available training data. A second classification model is trained using both real samples and samples synthesized by the conditional GAN. Thirdly, we compare with an auxiliary classifier GAN. Finally, a classifier is learned by means of the three-player game. As mentioned in Section 3.1, the discriminator and generator are initialized as the models from the conditional GAN.
Figure 3: The real distribution consists of two classes which partially overlap. Both are two-dimensional Gaussians whose means are indicated as dots; the circles correspond to multiples of the standard deviation. (a) The true distribution consists of two classes, both distributed as two-dimensional Gaussians. (b) Distribution learned by the generator during the three-player game: the generator synthesizes samples located in the area where the two class distributions overlap. We find that the three-player GAN learns to synthesize samples at locations where the two classes overlap; these are samples which are hard to label correctly for a classification model.
Figure 4: Images generated during the three-player
game.
The network architecture for the classifier is based on one column of the multi-column deep neural network used for traffic sign recognition in [14]. The discriminator and generator networks are based on earlier work [2]. More details can be found in the supplemental materials. The classification network was trained for 150 epochs with an Adam optimizer [15] (μ_0 = 0.001, β_1 = 0.5, β_2 = 0.999). The learning rate is decayed by a factor of 10 every 60 epochs. A weight decay term of 1e−4 is included in the classification loss. The conditional GAN was trained for 500 epochs using batches of size 64.
For the auxiliary classifier GAN, we reused the architecture and training scheme from the original work [13]. We used an Adam optimizer with the same learning parameters as [16] (μ_0 = 0.0002, β_1 = 0.0, β_2 = 0.9). The initial learning rates for the three-player game are the same as for the other training strategies. The results can be found in Table 1. We find that training the classification model by means of the three-player game improves the test accuracy. Figure 4 shows images that were generated during the three-player game.
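For reference, the classifier settings quoted above correspond roughly to the following PyTorch configuration (a sketch; using Adam's built-in weight decay is an approximation of the weight decay term in the loss):

    # Sketch of the classifier's optimization settings (PyTorch assumed).
    import torch

    def make_classifier_optimizer(classifier: torch.nn.Module):
        optimizer = torch.optim.Adam(classifier.parameters(), lr=0.001,
                                     betas=(0.5, 0.999), weight_decay=1e-4)
        # Decay the learning rate by a factor of 10 every 60 epochs (150 epochs in total).
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=60, gamma=0.1)
        return optimizer, scheduler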
Table 1: Accuracy (%) on the CURE-TSR test set for different training schemes.

Method         Without real data   With real data
Baseline       -                   83.38
cGAN           77.08               83.23
ACGAN          -                   79.23
Three-player   79.83               85.41
4 Conclusion
We have proposed an effective yet simple method to improve classification networks by having a generative model synthesize difficult samples. The method is based on a regular GAN game, but includes an adversarial loss which steers the generator towards difficult samples. In comparison to previous work, we neither restrict nor limit the kind of augmentations that the generative model can learn. We find that the generative model is able to synthesize realistic-looking images which are hard to label correctly for the classification model. Since our method relies only on backpropagation, future research can investigate whether the idea also applies to other tasks.
Acknowledgement: The work was supported by
Toyota, and was carried out at the TRACE Lab at
KU Leuven (Toyota Research on Automated Cars in
Europe - Leuven).
5 Supplemental Materials
Network Architectures - CURE
Table 2: Discriminator

Operation      Features   Output size
Input          3          48 x 48
ResNet block   48         24 x 24
ResNet block   96         12 x 12
ResNet block   192        6 x 6
ResNet block   384        3 x 3
ReLU           -          -
Sum pool       384        1 x 1
Linear         1          -
Filter size: 3 x 3. Initialization: Xavier, gain √2.
Table 3: Generator

Operation      Features   Output size
Noise input    100 x 1    -
Linear         384        3 x 3
ResNet block   384        6 x 6
ResNet block   192        12 x 12
ResNet block   96         24 x 24
ResNet block   48         48 x 48
BatchNorm      -          -
ReLU           -          -
Convolution    3          48 x 48
Filter size: 3 x 3. Initialization: Xavier, gain √2.
Table 4: Classifier

Operation     Features   Kernel   Nonlinearity
Convolution   100        7 x 7    ReLU
Convolution   150        4 x 4    ReLU
Convolution   250        4 x 4    ReLU
Linear        300        -        ReLU
Linear        43         -        -
Initialization: Xavier, gain √2. Batch normalization after each convolution. Dropout (p = 0.5) in the linear layers.
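For illustration, the classifier of Table 4 could be written in PyTorch as follows; the max-pooling layers, strides, padding and the 48 x 48 input resolution are assumptions, as the table does not specify them.

    # Sketch of the Table 4 classifier (PyTorch assumed); pooling, strides and padding are assumed.
    import torch.nn as nn

    class TrafficSignClassifier(nn.Module):
        def __init__(self, in_channels=3, n_classes=43):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(in_channels, 100, kernel_size=7), nn.BatchNorm2d(100), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(100, 150, kernel_size=4), nn.BatchNorm2d(150), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(150, 250, kernel_size=4), nn.BatchNorm2d(250), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.head = nn.Sequential(
                nn.Flatten(),
                # 250 * 3 * 3 assumes 48 x 48 inputs and the pooling layout above.
                nn.Linear(250 * 3 * 3, 300), nn.ReLU(), nn.Dropout(p=0.5),
                nn.Linear(300, n_classes),
            )

        def forward(self, x):
            return self.head(self.features(x))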
References
[1] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu,
D. Warde-Farley, S. Ozair, A. Courville, and Y. Ben-
gio, “Generative adversarial nets,” in Advances in neu-
ral information processing systems, 2014, pp. 2672–
2680.
[2] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida,
“Spectral normalization for generative adversarial net-
works,” arXiv preprint arXiv:1802.05957, 2018.
[3] H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena,
“Self-attention generative adversarial networks,” arXiv
preprint arXiv:1805.08318, 2018.
[4] A. Brock, J. Donahue, and K. Simonyan, “Large scale
gan training for high fidelity natural image synthesis,”
arXiv preprint arXiv:1809.11096, 2018.
[5] X. Peng, Z. Tang, F. Yang, R. S. Feris, and D. Metaxas,
“Jointly optimize data augmentation and network train-
ing: Adversarial data augmentation in human pose es-
timation,” in Proceedings of the IEEE Conference on
Computer Vision and Pattern Recognition, 2018, pp.
2226–2234.
[6] J. T. Springenberg, “Unsupervised and semi-supervised
learning with categorical generative adversarial net-
works,” arXiv preprint arXiv:1511.06390, 2015.
[7] A. Odena, “Semi-supervised learning with generative
adversarial networks,” arXiv preprint arXiv:1606.01583,
2016.
[8] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung,
A. Radford, and X. Chen, “Improved techniques for
training gans,” in Advances in Neural Information Pro-
cessing Systems, 2016, pp. 2234–2242.
[9] C. Li, K. Xu, J. Zhu, and B. Zhang, “Triple generative
adversarial nets,” arXiv preprint arXiv:1703.02291, 2017.
[10] D. Temel, G. Kwon, M. Prabhushankar, and G. Al-
Regib, “Cure-tsr: Challenging unreal and real envi-
ronments for traffic sign recognition,” arXiv preprint
arXiv:1712.02463, 2017.
[11] Y. Ganin and V. Lempitsky, “Unsupervised domain
adaptation by backpropagation,” arXiv preprint arXiv:1409.7495,
2014.
[12] I. Goodfellow, J. Shlens, and C. Szegedy, “Explaining
and harnessing adversarial examples,” 2015.
[13] A. Odena, C. Olah, and J. Shlens, “Conditional image synthesis with auxiliary classifier GANs,” in Proceedings of the 34th International Conference on Machine Learning, 2017.
[14] D. Ciresan, U. Meier, J. Masci, and J. Schmidhuber, “Multi-column deep neural network for traffic sign classification,” Neural Networks, vol. 32, pp. 333–338, 2012.
[15] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
