Semantic Image Synthesis With Spatially-Adaptive Normalization
Citations
3,940 citations
2,106 citations
851 citations
Additional excerpts
...texture synthesis [18, 33, 28], video generation [31, 30], image-to-image translation [11, 36, 1, 24] and object detection [19]....
[...]
744 citations
Cites background from "Semantic Image Synthesis With Spati..."
...This inductive bias towards local interactions thus leads to efficient computations, but the wide range of specialized layers which are introduced into CNNs to handle different synthesis tasks [46, 70, 59, 74, 73] suggests that this bias is often too restrictive....
[...]
...Dataset ours SPADE [46] Pix2PixHD (+aug) [65] CRN [9]...
[...]
...2 (where we compare to [46, 65, 31, 9]) and (ii) unconditional face synthesis in Tab....
[...]
697 citations
Cites background from "Semantic Image Synthesis With Spati..."
...Other approaches produce various outputs with the guidance of reference images [4, 5, 26, 32]....
[...]
References
123,388 citations
111,197 citations
Additional excerpts
...We use the ADAM optimizer [21] and set β1 = 0, β2 = 0.999....
[...]
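The excerpt above fixes the Adam hyper-parameters to β1 = 0, β2 = 0.999. As a minimal pure-Python sketch of what that choice means, here is a single Adam update step (the function name and the scalar-parameter setting are illustrative, not from the cited code); with β1 = 0 the first-moment estimate reduces to the raw gradient, i.e. momentum is effectively disabled:

```python
import math

def adam_step(theta, grad, state, lr=1e-4, beta1=0.0, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    state holds the step count t and the moment estimates m, v.
    With beta1 = 0 (as in the excerpt), m is just the current gradient.
    """
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    m_hat = state["m"] / (1 - beta1 ** state["t"])   # bias correction
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return theta - lr * m_hat / (math.sqrt(v_hat) + eps)
```

Setting β1 = 0 is a common choice in GAN training, where momentum on the generator/discriminator gradients can destabilize the adversarial game.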
73,978 citations
38,211 citations
"Semantic Image Synthesis With Spati..." refers background in this paper
...controllable, diverse outputs as shown in Figure 1. 2. Related Work: Deep generative models can learn to synthesize randomly sampled images. Recent methods include generative adversarial networks (GANs) [12] and variational autoencoders (VAEs) [22]. Our work is built on GANs but aims for the conditional image synthesis task. A GAN consists of a generator and a discriminator, where the goal of the generato...
[...]
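The excerpt breaks off mid-sentence. For context, the objective the generator and discriminator optimize in the cited GAN formulation is the standard minimax game (written here from the well-known formulation, not copied from the excerpt):

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log D(x)\right]
+ \mathbb{E}_{z \sim p_z}\!\left[\log\bigl(1 - D(G(z))\bigr)\right]
```

The discriminator D is trained to tell real samples from generated ones, while the generator G is trained to make D fail; conditional variants such as SPADE additionally feed a segmentation mask to both networks.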
30,843 citations
"Semantic Image Synthesis With Spati..." refers background or methods in this paper
...Unconditional normalization layers have been an important component in modern deep networks and can be found in various classifier designs, including the Local Response Normalization (LRN) in AlexNet [29] and Batch Normalization (BN) in the Inception-v2 network [21]. Figure 2: In SPADE, the mask is first projected onto an embedding space, and then convolved to produce the modulation parameters γ and β....
[...]
...In contrast to BatchNorm [19], they depend on the input segmentation mask and vary with respect to the location (y, x)....
[...]
...Similar to Batch Normalization [21], the activation is normalized in a channel-wise manner, and then modulated with the learned scale and bias....
[...]
...γ(m) and β(m) in (1) are the learned modulation parameters of the normalization layer. In contrast to BatchNorm [21], they depend on the input segmentation mask and vary with respect to the location (y, x)....
[...]
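The excerpts above describe the core SPADE mechanism: normalize each channel, then modulate every location with γ and β values derived from the segmentation mask. The following is a minimal pure-Python sketch of that idea; the function name is hypothetical, and a per-class lookup table stands in for the learned embedding-plus-convolution that produces γ and β in the actual paper:

```python
import math

def spade_normalize(x, mask, gamma_table, beta_table, eps=1e-5):
    """Spatially-adaptive normalization sketch.

    x           : feature map as nested lists, shape [C][H][W]
    mask        : segmentation mask, shape [H][W], integer class labels
    gamma_table : per-class scale, gamma_table[class][channel]
    beta_table  : per-class bias,  beta_table[class][channel]
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    out = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for c in range(C):
        # Channel-wise normalization (BatchNorm-style, batch of one).
        vals = [x[c][y][w] for y in range(H) for w in range(W)]
        mu = sum(vals) / len(vals)
        var = sum((v - mu) ** 2 for v in vals) / len(vals)
        std = math.sqrt(var + eps)
        for y in range(H):
            for w in range(W):
                k = mask[y][w]  # semantic class at location (y, x)
                # gamma/beta depend on the mask, so unlike BatchNorm's
                # single scale/bias per channel, they vary with (y, x).
                out[c][y][w] = (gamma_table[k][c] * (x[c][y][w] - mu) / std
                                + beta_table[k][c])
    return out
```

In the real layer, γ and β come from a small convolutional network applied to the (embedded) mask rather than a lookup table, but the key property is the same: the modulation parameters are functions of the mask and differ per spatial location.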