A Style-Based Generator Architecture for Generative Adversarial Networks
Tero Karras, Samuli Laine, Timo Aila
- pp 4396-4405
TL;DR: This paper proposes an alternative generator architecture for GANs, borrowing from style transfer literature, which leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images.
Abstract:
We propose an alternative generator architecture for generative adversarial networks, borrowing from style transfer literature. The new architecture leads to an automatically learned, unsupervised separation of high-level attributes (e.g., pose and identity when trained on human faces) and stochastic variation in the generated images (e.g., freckles, hair), and it enables intuitive, scale-specific control of the synthesis. The new generator improves the state-of-the-art in terms of traditional distribution quality metrics, leads to demonstrably better interpolation properties, and also better disentangles the latent factors of variation. To quantify interpolation quality and disentanglement, we propose two new, automated methods that are applicable to any generator architecture. Finally, we introduce a new, highly varied and high-quality dataset of human faces.
Citations
Proceedings Article
Dual Attention GANs for Semantic Image Synthesis
TL;DR: A novel Dual Attention GAN (DAGAN) is proposed to synthesize photo-realistic and semantically-consistent images with fine details from the input layouts without imposing extra training overhead or modifying the network architectures of existing methods.
Proceedings Article
GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving
Yun Chen, Frieda Rong, Shivam Duggal, Shenlong Wang, Xinchen Yan, Sivabalan Manivasagam, Shangjie Xue, Ersin Yumer, Raquel Urtasun
TL;DR: GeoSim is a geometry-aware image composition process that synthesizes novel urban driving scenarios by augmenting existing images with dynamic objects extracted from other scenes and rendered at novel poses.
Proceedings Article
The Role of ImageNet Classes in Fréchet Inception Distance
TL;DR: The paper concludes that FID is prone to intentional or accidental distortions, and discusses a case where an ImageNet pre-trained FastGAN achieves an FID comparable to StyleGAN2 while being worse in terms of human evaluation.
Posted Content
Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion
TL;DR: This paper proposes Blow, a single-scale normalizing flow using hypernetwork conditioning to perform many-to-many voice conversion between raw audio, and shows that Blow compares favorably to existing flow-based architectures and other competitive baselines, obtaining equal or better performance in both objective and subjective evaluations.
Book Chapter
Synthesizing Coupled 3D Face Modalities by Trunk-Branch Generative Adversarial Networks
Baris Gecer, Alexander Lattas, Stylianos Ploumpis, Jiankang Deng, Athanasios Papaioannou, Stylianos Moschoglou, Stefanos Zafeiriou
TL;DR: This paper presents the first methodology that jointly generates high-quality texture, shape, and normals, which can be used for photo-realistic synthesis, and proposes a novel GAN that can generate data from different modalities while exploiting their correlations.