Open Access, Proceedings Article

A Style-Based Generator Architecture for Generative Adversarial Networks

Tero Karras, +2 more
- pp 4396-4405
TLDR
This paper proposes an alternative generator architecture for GANs, borrowing from the style transfer literature. The architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) from stochastic variation in the generated images.
Abstract
We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis. The new generator improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation. To quantify interpolation quality and disentanglement, we propose two new, automated methods that are applicable to any generator architecture. Finally, we introduce a new, highly varied and high-quality dataset of human faces.


Citations
Proceedings Article

Dual Attention GANs for Semantic Image Synthesis

TL;DR: A novel Dual Attention GAN (DAGAN) is proposed to synthesize photo-realistic and semantically-consistent images with fine details from the input layouts without imposing extra training overhead or modifying the network architectures of existing methods.
Proceedings Article

GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving

TL;DR: GeoSim is a geometry-aware image composition process that synthesizes novel urban driving scenarios by augmenting existing images with dynamic objects extracted from other scenes and rendered at novel poses.
Proceedings Article

The Role of ImageNet Classes in Fréchet Inception Distance

TL;DR: The paper concludes that FID is prone to intentional or accidental distortions, and discusses a case where an ImageNet-pre-trained FastGAN achieves an FID comparable to StyleGAN2 while being worse in terms of human evaluation.
Posted Content

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion

TL;DR: This paper proposes Blow, a single-scale normalizing flow using hypernetwork conditioning to perform many-to-many voice conversion between raw audio, and shows that Blow compares favorably to existing flow-based architectures and other competitive baselines, obtaining equal or better performance in both objective and subjective evaluations.
Book Chapter

Synthesizing Coupled 3D Face Modalities by Trunk-Branch Generative Adversarial Networks

TL;DR: This paper presents the first methodology that jointly generates high-quality texture, shape, and normals, which can be used for photo-realistic synthesis. It also proposes a novel GAN that can generate data from different modalities while exploiting their correlations.