Semantic Image Synthesis With Spatially-Adaptive Normalization
Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu
pp. 2337–2346
TLDR
Spatially-adaptive normalization is proposed: a simple but effective layer for synthesizing photorealistic images given an input semantic layout, which lets users easily control the style and content of image synthesis results and create multi-modal results.
Abstract
We propose spatially-adaptive normalization, a simple but effective layer for synthesizing photorealistic images given an input semantic layout. Previous methods directly feed the semantic layout as input to the network, forcing the network to memorize the information throughout all the layers. Instead, we propose using the input layout to modulate the activations in normalization layers through a spatially-adaptive, learned affine transformation. Experiments on several challenging datasets demonstrate the superiority of our method over existing approaches in both visual fidelity and alignment with the input layout. Finally, our model allows users to easily control the style and content of image synthesis results as well as create multi-modal results. Code is available upon publication.
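The modulation described in the abstract can be sketched in a few lines of NumPy. This is an illustrative simplification, not the paper's implementation: the paper produces the per-pixel modulation parameters with a small convolutional network over the layout, which is replaced here by a single hypothetical 1×1 convolution expressed as a label-to-channel weight matrix.

```python
import numpy as np

def spade(x, segmap, gamma_w, beta_w, eps=1e-5):
    """Spatially-adaptive normalization sketch.

    x        : activations, shape (N, C, H, W)
    segmap   : one-hot semantic layout, shape (N, L, H, W)
    gamma_w  : (C, L) weights of an assumed 1x1 conv producing scale maps
    beta_w   : (C, L) weights of an assumed 1x1 conv producing shift maps
    """
    # Normalize per channel, batch-norm style, with no fixed learned affine.
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mu) / np.sqrt(var + eps)
    # Spatially-varying affine parameters derived from the layout:
    # each pixel's (gamma, beta) depends on its semantic label.
    gamma = np.einsum('cl,nlhw->nchw', gamma_w, segmap)
    beta = np.einsum('cl,nlhw->nchw', beta_w, segmap)
    return (1 + gamma) * x_hat + beta
```

The key contrast with ordinary batch or instance normalization is that gamma and beta here are tensors with spatial extent, computed from the input layout, rather than per-channel scalars, so the layout information survives normalization instead of being washed out.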
Citations
Journal Article
A study report on "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks"
Posted Content
Self-Attention Generative Adversarial Networks
TL;DR: The Self-Attention Generative Adversarial Network (SAGAN) introduces attention-driven, long-range dependency modeling for image generation tasks and achieves state-of-the-art results.
Proceedings Article
Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?
TL;DR: This work proposes an efficient algorithm to embed a given image into the latent space of StyleGAN, which enables semantic image editing operations that can be applied to existing photographs.
Posted Content
Taming Transformers for High-Resolution Image Synthesis
TL;DR: It is demonstrated how combining the effectiveness of the inductive bias of CNNs with the expressivity of transformers enables them to model and thereby synthesize high-resolution images.
Posted Content
StarGAN v2: Diverse Image Synthesis for Multiple Domains
TL;DR: StarGAN v2, a single framework that tackles image-to-image translation models with limited diversity and multiple models for all domains, is proposed and shows significantly improved results over the baselines.
References
Posted Content
Large Scale GAN Training for High Fidelity Natural Image Synthesis
TL;DR: BigGAN applies orthogonal regularization to the generator and allows fine control over the trade-off between sample fidelity and variety by reducing the variance of the generator's input, setting a new state of the art in class-conditional image synthesis.
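The phrase "reducing the variance of the generator's input" refers to BigGAN's truncation trick: latent entries whose magnitude exceeds a threshold are resampled before being fed to the generator. A minimal sketch, assuming a rejection-sampling loop (the function name and signature are illustrative):

```python
import numpy as np

def truncated_z(shape, threshold=0.5, rng=None):
    """Sample a latent vector from a truncated standard normal.

    Any entries with |z| > threshold are redrawn until all lie within
    the threshold; smaller thresholds trade sample variety for fidelity.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(shape)
    while True:
        mask = np.abs(z) > threshold
        if not mask.any():
            return z
        z[mask] = rng.standard_normal(mask.sum())
```

At a threshold near zero the generator sees only the mode of the latent distribution (high fidelity, low variety); as the threshold grows, sampling approaches the untruncated prior.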
Proceedings Article
Self-Attention Generative Adversarial Networks
TL;DR: The proposed SAGAN achieves state-of-the-art results, boosting the best published Inception score from 36.8 to 52.52 and reducing the Fréchet Inception distance from 27.62 to 18.65 on the challenging ImageNet dataset.
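SAGAN's long-range dependency modeling is a non-local self-attention block applied to convolutional feature maps. A minimal single-image NumPy sketch, with the paper's three 1×1 convolutions reduced to plain weight matrices (the names wf, wg, wh and the fixed gamma are illustrative assumptions; in the paper gamma is learned):

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, wf, wg, wh, gamma=0.5):
    """Non-local self-attention over a feature map.

    x          : feature map, shape (C, H, W)
    wf, wg     : (C', C) query/key projections (stand-ins for 1x1 convs)
    wh         : (C, C) value projection
    gamma      : residual scale (learned in the paper, fixed here)
    """
    C, H, W = x.shape
    flat = x.reshape(C, H * W)           # (C, N) with N = H*W positions
    f = wf @ flat                        # queries, (C', N)
    g = wg @ flat                        # keys,    (C', N)
    h = wh @ flat                        # values,  (C, N)
    attn = softmax(f.T @ g, axis=1)      # (N, N): each position attends everywhere
    o = h @ attn.T                       # attention-weighted values, (C, N)
    return x + gamma * o.reshape(C, H, W)  # scaled residual connection
```

Because every position attends to every other position, the block can coordinate distant image regions (e.g. matching a dog's two ears) that local convolutions struggle to relate.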
Proceedings Article
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
TL;DR: AttnGAN proposes an attentional generative network that synthesizes fine-grained details in different sub-regions of the image by attending to the relevant words in the natural-language description.
Proceedings Article
Scene completion using millions of photographs
James Hays, Alexei A. Efros
TL;DR: A new image completion algorithm powered by a huge database of photographs gathered from the Web, requiring no annotations or labelling by the user, that can generate a diverse set of results for each input image and allow users to select among them.
Book Chapter
Unified Perceptual Parsing for Scene Understanding
TL;DR: A multi-task framework called UPerNet and an accompanying training strategy are developed to learn from heterogeneous image annotations; the framework is shown to effectively segment a wide range of concepts from images.