Open AccessPosted Content
Ensembling with Deep Generative Views
Reads0
Chats0
TLDR
In this article, the authors use StyleGAN2 as the source of GAN-based augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars, and find that while test-time ensembling with GANbased augmented images can offer some small improvements, the remaining bottlenecks are the efficiency and accuracy of the GAN reconstructions, coupled with classifier sensitivities to artifacts in GANgenerated images.Abstract:
Recent generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose, simply by learning from unlabeled image collections. Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification. Using a pretrained generator, we first find the latent code corresponding to a given real input image. Applying perturbations to the code creates natural variations of the image, which can then be ensembled together at test-time. We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars. Critically, we find that several design decisions are required towards making this process work; the perturbation procedure, weighting between the augmentations and original image, and training the classifier on synthesized images can all impact the result. Currently, we find that while test-time ensembling with GAN-based augmentations can offer some small improvements, the remaining bottlenecks are the efficiency and accuracy of the GAN reconstructions, coupled with classifier sensitivities to artifacts in GAN-generated images.read more
Citations
More filters
Posted Content
Generative Models as a Data Source for Multiview Representation Learning
TL;DR: In this paper, the authors compare several representation learning methods that can be applied to this setting, using the latent space of the generator to generate multiple "views" of the same semantic content.
References
More filters
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Journal ArticleDOI
ImageNet Large Scale Visual Recognition Challenge
Olga Russakovsky,Jia Deng,Hao Su,Jonathan Krause,Sanjeev Satheesh,Sean Ma,Zhiheng Huang,Andrej Karpathy,Aditya Khosla,Michael S. Bernstein,Alexander C. Berg,Li Fei-Fei +11 more
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) as mentioned in this paper is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Posted Content
Adam: A Method for Stochastic Optimization
Diederik P. Kingma,Jimmy Ba +1 more
TL;DR: In this article, the adaptive estimates of lower-order moments are used for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimate of lowerorder moments.
Dissertation
Learning Multiple Layers of Features from Tiny Images
TL;DR: In this paper, the authors describe how to train a multi-layer generative model of natural images, using a dataset of millions of tiny colour images, described in the next section.
Proceedings ArticleDOI
Mask R-CNN
TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.
Related Papers (5)
Data augmentation for handwritten digit recognition using generative adversarial networks
Ganesh Jha,Hubert Cecotti +1 more