Open Access · Journal Article (DOI)

Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning

TL;DR: Virtual adversarial training (VAT) is a regularization method based on virtual adversarial loss, a measure of the local smoothness of the conditional label distribution given the input.
Abstract
We propose a new regularization method based on virtual adversarial loss: a new measure of local smoothness of the conditional label distribution given input. Virtual adversarial loss is defined as the robustness of the conditional label distribution around each input data point against local perturbation. Unlike adversarial training, our method defines the adversarial direction without label information and is hence applicable to semi-supervised learning. Because the directions in which we smooth the model are only “virtually” adversarial, we call our method virtual adversarial training (VAT). The computational cost of VAT is relatively low. For neural networks, the approximated gradient of virtual adversarial loss can be computed with no more than two pairs of forward- and back-propagations. In our experiments, we applied VAT to supervised and semi-supervised learning tasks on multiple benchmark datasets. With a simple enhancement of the algorithm based on the entropy minimization principle, our VAT achieves state-of-the-art performance for semi-supervised learning tasks on SVHN and CIFAR-10.
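For readers who want to see the mechanics, the sketch below shows the local distributional smoothness (LDS) term of VAT in PyTorch, with the virtual adversarial direction approximated by a single power iteration as the abstract's "two pairs of forward- and back-propagations" suggests. The function name and the default values of `xi` and `eps` are illustrative assumptions, not the authors' released code; in the paper, `eps` is tuned per dataset.

```python
import torch
import torch.nn.functional as F

def _l2_normalize(d):
    # Rescale each example's perturbation to unit L2 norm.
    flat = d.flatten(1)
    return d / (flat.norm(dim=1).view(-1, *([1] * (d.dim() - 1))) + 1e-8)

def vat_loss(model, x, xi=1e-6, eps=2.5, n_power=1):
    # Current predictions act as the "virtual" labels; no ground truth needed.
    with torch.no_grad():
        p = F.softmax(model(x), dim=1)

    # Start from a random direction and refine it by power iteration.
    d = _l2_normalize(torch.randn_like(x))
    for _ in range(n_power):
        d.requires_grad_(True)
        log_p_hat = F.log_softmax(model(x + xi * d), dim=1)
        dist = F.kl_div(log_p_hat, p, reduction="batchmean")
        d = _l2_normalize(torch.autograd.grad(dist, d)[0])

    # Penalize sensitivity along the virtual adversarial direction eps * d.
    log_p_hat = F.log_softmax(model(x + eps * d), dim=1)
    return F.kl_div(log_p_hat, p, reduction="batchmean")
```

In training, this term is added to the supervised loss with a weighting coefficient; because it needs no labels, it can be evaluated on unlabeled batches as well, which is what makes the method applicable to semi-supervised learning.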


Citations
Journal Article (DOI)

Robust graph convolutional networks with directional graph adversarial training

TL;DR: Proposes Directional Graph Adversarial Training (DGAT), a graph-specific adversarial training method that incorporates the graph structure into the adversarial process, automatically identifies the impact of perturbations from neighbor nodes, and introduces an adversarial regularizer to defend against the worst-case perturbation.
Proceedings Article (DOI)

Cross-patch Dense Contrastive Learning for Semi-supervised Segmentation of Cellular Nuclei in Histopathologic Images

TL;DR: The key idea is to align features of the teacher and student networks, sampled across images at both the patch and pixel levels, enforcing intra-class compactness and inter-class separability of features, which helps extract valuable knowledge from unlabeled data.
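The cross-patch sampling strategy is specific to that paper, but the teacher-student feature alignment it relies on can be illustrated with a generic InfoNCE-style contrastive loss. The names and temperature below are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def contrastive_align(student_feats, teacher_feats, temperature=0.1):
    # Generic InfoNCE alignment: each student feature should match its
    # corresponding (positive) teacher feature and repel all others.
    s = F.normalize(student_feats, dim=1)           # (N, D)
    t = F.normalize(teacher_feats.detach(), dim=1)  # teacher is not updated by this loss
    logits = s @ t.T / temperature                  # (N, N) pairwise similarities
    targets = torch.arange(s.size(0), device=s.device)
    return F.cross_entropy(logits, targets)
```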
Proceedings Article

Debiased Self-Training for Semi-Supervised Learning

TL;DR: Proposes Debiased Self-Training (DST), which can be seamlessly adapted to other self-training methods, helping stabilize their training and balance performance across classes, both when training from scratch and when fine-tuning pre-trained models.
Proceedings Article (DOI)

Dense Learning based Semi-Supervised Object Detection

TL;DR: This paper proposes a DenSe Learning (DSL) based anchor-free SSOD algorithm and introduces several novel techniques, including an Adaptive Filtering strategy for assigning accurate, multi-level, dense pixel-wise pseudo-labels, an Aggregated Teacher for producing stable and precise pseudo-labels, and an uncertainty-consistency regularization term across scales and shuffled patches for improving the generalization capability of the detector.
Posted Content

Multiview Pseudo-Labeling for Semi-supervised Learning from Video

TL;DR: This paper proposes a multiview pseudo-labeling approach for semi-supervised video representation learning, which uses complementary views in the form of appearance and motion information.
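The multiview fusion of appearance and motion is that paper's contribution; the basic pseudo-labeling step it builds on looks like the generic, confidence-thresholded sketch below (the threshold value is an illustrative assumption).

```python
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, x_unlabeled, threshold=0.95):
    # Keep only predictions whose max probability exceeds the threshold
    # and train on them as if they were ground-truth labels.
    with torch.no_grad():
        probs = F.softmax(model(x_unlabeled), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = conf >= threshold
    if not mask.any():
        return x_unlabeled.new_zeros(())  # no confident predictions this batch
    logits = model(x_unlabeled[mask])
    return F.cross_entropy(logits, pseudo[mask])
```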
References
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
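As a reference point, the update rule the TL;DR alludes to is compact enough to state directly. This is a generic sketch with the paper's recommended hyperparameter defaults, operating on raw tensors rather than through a framework optimizer.

```python
import torch

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    # One Adam update with bias-corrected moment estimates; t starts at 1.
    m = beta1 * m + (1 - beta1) * grad         # first moment: EMA of gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # second moment: EMA of squared gradients
    m_hat = m / (1 - beta1 ** t)               # correct the zero-initialization bias
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (torch.sqrt(v_hat) + eps)
    return theta, m, v
```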
Journal Article (DOI)

Generative Adversarial Nets

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than from G.
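The two-player setup can be summarized in a few lines. The sketch below uses the non-saturating generator loss discussed in the paper; `D` and `G` are assumed to return raw logits and samples, respectively.

```python
import torch
import torch.nn.functional as F

def gan_losses(D, G, x_real, z):
    x_fake = G(z)
    # Discriminator: classify real samples as 1 and generated samples as 0.
    logits_real = D(x_real)
    logits_fake = D(x_fake.detach())  # detach so D's loss does not update G
    d_loss = (F.binary_cross_entropy_with_logits(logits_real, torch.ones_like(logits_real))
              + F.binary_cross_entropy_with_logits(logits_fake, torch.zeros_like(logits_fake)))
    # Generator: fool the discriminator (non-saturating heuristic).
    logits_g = D(x_fake)
    g_loss = F.binary_cross_entropy_with_logits(logits_g, torch.ones_like(logits_g))
    return d_loss, g_loss
```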
Journal Article

Dropout: a simple way to prevent neural networks from overfitting

TL;DR: It is shown that dropout improves the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
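The mechanism itself is only a few lines at train time. Below is a sketch of the common "inverted" formulation, which rescales activations during training so that nothing changes at test time; the paper's original formulation instead scales the weights at test time.

```python
import torch

def dropout(x, p=0.5, training=True):
    # Zero each unit with probability p; rescale survivors by 1/(1-p)
    # so the expected activation is unchanged.
    if not training or p == 0.0:
        return x
    mask = (torch.rand_like(x) >= p).float()
    return x * mask / (1.0 - p)
```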
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
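For reference, the training-time transform is a per-feature standardization over the mini-batch followed by a learned affine map. The sketch below assumes a 2-D input of shape (N, D) and omits the running statistics the method uses at inference; for convolutional inputs the statistics are taken over the batch and spatial dimensions.

```python
import torch

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize each feature over the mini-batch, then scale and shift.
    mean = x.mean(dim=0)
    var = x.var(dim=0, unbiased=False)
    x_hat = (x - mean) / torch.sqrt(var + eps)
    return gamma * x_hat + beta
```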
Proceedings Article (DOI)

Densely Connected Convolutional Networks

TL;DR: DenseNet connects each layer to every subsequent layer in a feed-forward fashion, which alleviates the vanishing-gradient problem, strengthens feature propagation, encourages feature reuse, and substantially reduces the number of parameters.
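The connectivity pattern is easy to see in code: each layer's output is the concatenation of its input with the newly computed features, so later layers receive all earlier feature maps. A minimal sketch of one layer, following the paper's BN-ReLU-Conv composite function:

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    # One DenseNet-style layer: output = input concatenated with new features.
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth_rate,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x):
        new_features = self.conv(torch.relu(self.bn(x)))
        return torch.cat([x, new_features], dim=1)  # every layer sees all earlier features
```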